Employee Attitudes at The University of Illinois


Gerald Carter 
University of Illinois 


An attitude survey was recently conducted for all nonacademic em- 
ployees of the University of Illinois. It was sponsored jointly by the Civil 
Service Employee Council and the Office of Nonacademic Personnel. 
Instructions were issued that the signing of questionnaires was optional. 
Approximately 2000 questionnaires were sent out, and 447 were returned. 

Lack of interest in mimeographed forms distributed by campus mail
probably helps explain why only 22% of the questionnaires were
returned. With so low a return, it is impossible to determine the extent
to which these answers are representative of the total group. It has been
recommended
that future polls be conducted in departmental meetings, allowing each 
employee to select a questionnaire from a stack in order to prevent iden- 
tification. These would then be checked without coaching or discussion 
and deposited in a sealed box to be tabulated by an impartial agent. 

In the following results, the per cent of the total of returned question- 
naires precedes each answer. Some of the questionnaires were not an- 
swered completely. Question 29 was so worded that two answers might 
apply for many employees; hence, the total of the percentages exceeds 
100% in this case. 
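
The return rate and the note on Question 29 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the percentages are copied from the figures reported for Questions 2, 27, and 29 in the Results.

```python
# Percentages as reported in the Results for three of the questions.
reported = {
    2: [51, 6, 30, 13],
    27: [81, 6, 7, 6],
    29: [24, 10, 45, 25, 20, 6],
}


def response_rate(returned, sent):
    """Share of distributed questionnaires that came back, in per cent."""
    return 100 * returned / sent


def totals(table):
    """Sum the reported percentages for each question."""
    return {question: sum(pcts) for question, pcts in table.items()}


rate = response_rate(447, 2000)      # about 22 per cent
sums = totals(reported)
over_100 = [q for q, s in sums.items() if s > 100]

print(f"return rate: {rate:.2f}%")
print("totals by question:", sums)
print("questions totalling more than 100%:", over_100)
```

Only Question 29, where two answers could apply, totals more than 100%.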


Results 


1. If you had a full and free choice, at what age would you want to 
retire from any active full time employment, assuming a reasonable an- 
nual allowance were paid to you for the rest of your life? 18% 50 yrs.; 
27% 55 yrs.; 6% 58 yrs.; 30% 60 yrs.; 10% 65 yrs.; 14% 68 yrs.; 1%
some other age; 4% no answer.

2. If earlier retirement meant changing other features, which would 
you prefer? 51% earlier retirement by employee paying in more; 6% 
earlier retirement by reducing the annuity; 30% retain present retirement 
features; 13% no answer. 

3. Of the three benefits which the Retirement System provides for 
you, which of the following is most important to you? 28% disability 
benefits; 10% death benefits; 59% life income following retirement from
active service; 3% no answer. 

4. Do you favor compulsory retirement after 25 years of University 
service? 40% no; 19% uncertain; 34% yes; 7% no answer. 

5. Would you favor other modifications in the Retirement System? 
35% yes; 26% no; 39% no answer. 

6. Do you understand the provisions as to Disability Benefits? 

a. University Provisions: 23% no; 39% uncertain; 25% yes; 18% 
no answer. 

b. Retirement System provisions: 19% no; 40% uncertain; 26% yes; 
15% no answer. 

7. Are your working conditions clear and understandable, so that you 
know clearly what you are expected to do? 1% rarely; 5% sometimes; 
29% usually; 32% almost always; 27% always; 6% no answer. 

8. In your opinion, how does the average University employee com- 
pare, as to general ability, with the average employee doing similar work 
in the local community? 0% much lower; 1% lower; 38% about the same; 
40% higher; 13% much higher; 8% no answer. 

9. How does the University compare with other places in town as to 
desirability of working conditions? 0% worst place in town; 1% worse 
than average; 31% average; 44% better than average; 15% best place 
in town; 9% no answer. 

10. Do you agree with the University’s present policy to pay the pre- 
vailing rate, which is defined as the wages paid generally in this locality 
to the employees engaged in work of a similar character, where such pre- 
vailing rates can be determined? 16% no; 21% uncertain; 54% yes; 
9% no answer. 

11. Do you believe that in general the wage and salary policy of the 
University is fair? 4% rarely; 18% sometimes; 42% usually; 20% almost 
always; 6% always; 10% no answer. 

12. Do you find your supervisor sufficiently informed to answer your 
questions on University policy, rules, and procedures? 7% rarely; 11% 
sometimes; 27% usually; 25% almost always; 21% always; 9% no an- 
swer. 

13. How do you feel about your opportunities for advancement? 
13% no opportunity; 27% poor opportunity; 34% fair opportunity; 13%
good opportunity; 3% very good opportunity; 10% no answer.

14. Is supervision too lax or too severe? 6% too lax; 3% too severe; 
59% O.K.; 32% no answer. 

15. Do you think more people are needed in your department to do 
the work that is required? 2% too many people now; 57% about right 
number; 30% need a few more; 3% need many more; 8% no answer. 

16. Do you believe your own compensation is reasonable: 

a. In proportion to that of others on this Campus? 3% much too
low; 40% too low; 55% about right; 0% too high; 0% much too high;
2% no answer.

b. In proportion to work similar to yours off the Campus? 5% much 
too low; 28% too low; 51% about right; 1% too high; 0% much too high; 
15% no answer. 

17. Do you believe that the attitude of the University administration 
is: 8% pro-union; 59% neutral and unbiased; 10% anti-union; 23% no 
answer. 

18. If consideration should ever be given to returning University 
Civil Service employees to the State Civil Service Commission, what 
would be your preference? 41% strongly in favor of University Civil 
Service; 7% slightly in favor of University Civil Service; 18% no pref- 
erence; 7% slightly in favor of State Civil Service; 13% strongly in favor 
of State Civil Service; 14% no answer. 

19. Have you taken any part in the recently inaugurated Training and
Educational Program? 16% evening classes; 10% special supervisory 
courses; 6% regular University classes; 68% no answer. 

20. Should this program be further expanded? 2% no; 18% uncer- 
tain; 40% yes; 40% no answer. 

21. What course or subject would you like to see given? Air con- 
ditioning, art appreciation, bacteriology, chemistry, dietetics, business 
English, nursing ethics, home nursing, materia medica, advanced
mathematics, personnel relations, elements of supervision, applied psychology,
typing and shorthand, University office practice. 

22. Do you prefer social and recreational programs planned for the 
entire group of nonacademic employees (1500 in Urbana—800 in Chicago) 
or by separate departments? 27% entire University; 44% by depart- 
ments; 29% no answer. 

23. What recreation would you like to have organized for employees? 
15% softball; 10% basketball; 39% bowling; 19% other; 17% no answer. 

24. Would you like to have more information on the Credit Union? 
38% no; 9% uncertain; 30% yes; 23% no answer. 

25. Did the last Civil Service Examination which you took strike you 
as a fair test of your knowledge and ability for the position to which it 
referred? 6% unfair; 35% reasonably fair; 25% very fair; 34% no 
answer. 

26. What suggestions do you have for improving our examination
procedure? Many helpful suggestions were offered, most of them
suggesting revision of the present examinations for specific classifications.

27. Are you in favor of a 40 hour week? 81% yes; 6% uncertain; 
7% no; 6% no answer. 










28. When you were employed by the University what was your first 
impression of your new employer? 32% excellent; 42% good; 12% fair; 
3% poor; 0% very poor; 11% no answer. 

29. Which of the factors listed below was the most important in es- 
tablishing this impression? 24% interviewer; 10% surroundings of the 
first interview; 45% the person under whom you work; 25% your fellow 
workers; 20% working conditions; 6% others.

30. What change, if any, has there been in this impression? Most
employees indicated no change. However, a few had experienced changes 
for the worse and a still smaller number signified that their impressions 
had been improved. 

31. This Attitude Survey is being sponsored by the Council in an 
effort to find out what you, its constituents, think and want. What 
would you suggest as worthwhile projects for the Council to undertake? 
Further investigation into working conditions in each department; work 
on earlier retirement age for women; more harmony; evening classes with 
credit; a better health program; better medical care; talk over employees’ 
problems; more attitude surveys; occasional meetings for all employees; 
more representative council; fairer classifications based on ability and 
education; try to iron out some of the minor troubles between employees 
and administration; more rest rooms; bi-monthly pay period; arrange- 
ment of lunch hours; salary adjustment; recreation program; have voice 
in interpretation of policy; adequate ventilating systems; better
promotion system.

32. Do you believe such Committees or Councils can accomplish 
any useful results? 3% no; 28% uncertain; 41% yes; 28% no answer. 

33. Do you appreciate this opportunity to fill out this questionnaire 
and to express your ideas and feelings? 3% I think it is silly; 44% I 
think it is a rather good idea to express these things; 6% I don’t know;
37% I think it is an excellent thing to be able to fill out this question- 
naire; 10% no answer. 


Conclusions 


1. Employees need more information about the retirement system and 
disability benefits. 

2. The majority of responding employees prefer University Civil 
Service to State Civil Service. 

3. The training program should be expanded. 

4. Many employees desire University credit for evening courses. 

5. Bowling is the favorite recreation. 

6. The University compares favorably with other local employers as 
to working conditions. 

7. The majority of responding employees feel that their opportunities
for advancement are poor or only fair. 

8. The majority of the returned questionnaires favor a 40 hour week. 

9. The Civil Service Employee Council is believed to be worthwhile. 

10. The majority of replies favored recreational programs organized 
on a departmental basis. 

11. Most of the questionnaires returned favor a retirement age earlier 
than the present age of 68, even if it is necessary for them to pay more into 
the retirement system. 

12. The majority of replies favored the present policy of paying the 
prevailing rate paid in the local community for similar work. 


The data were further analyzed in order to study the effect of such 
factors as age, sex, length of service and type of work. Some of the most 
significant results of this analysis are listed below: 


1. Women employees wish to retire at an earlier age than do men. 

2. Young employees wish to retire at an earlier age than do older 
workers. 

3. Recently hired employees wish to retire at an earlier age than do 
employees with many years of service. 

4. The provisions as to disability benefits were understood by a larger 
per cent of women than men. 

5. Working conditions were better understood by women than by men.

6. Men believed the average University employee to be more supe- 
rior to others in the community than did women. 

7. Women showed a stronger preference for the University as com- 
pared to other local employers than did the men. 

8. Women showed stronger agreement with the policy of paying the 
prevailing rate in the community than did the men. 

9. Men believed the wage and salary policies to be fairer than did the 
women. 

10. Men showed a stronger preference for University Civil Service 
over State Civil Service than did the women. 

11. Men believed the present examinations to be fairer than did the 
women. 


Many constructive and useful suggestions and comments in addition 
to the checked answers were written on the returned questionnaires. 
These are now being studied carefully and many of them will result in 
improvements in procedure or policy. Some of the changes already put 
into effect or soon to be made are listed below: 


1. Many of the present Civil Service examinations are being com- 
pletely rewritten. 

2. Great effort is being made to promote present employees rather
than to hire new ones at the upper levels of various classifications.

3. The present salary structure is being carefully studied. Inequal- 
ities are being corrected in the light of a job-evaluation. 

4. Changes in the retirement system have been recommended. 

5. Plans are being made for expanding the training program. 

6. Information concerning the retirement system and disability bene- 
fits is being distributed. 


Information concerning the attitudes of University employees toward
the retirement system, general working conditions, and University
customs and policies has been extremely helpful to the Director of
Nonacademic Personnel. The results of this survey will also be very
helpful in determining the future action and policies of the Civil Service
Employee Council. In addition, the project provided a splendid opportunity
for many employees to “air” small grievances that had perturbed them, in
some instances, for many years.


Received September 20, 1946. 














A Biological-Pharmaceutical Checker Selection Program 


Howard Maher and Isabelle E. Fife 
Sharp and Dohme, Inc., Philadelphia, Pa. 


One of the most important jobs in a drug manufacturing company is 
the Checker’s job. Although systematic checks are made by the profes- 
sionally trained personnel in the earlier stages of manufacture, in the 
final stages of drug filling, labeling, and packaging, the Checker’s mis- 
takes can cause a great loss of reputation and sales. This paper con- 
cerns the results of a program undertaken for the purpose of constructing 
a checker selection test battery. Four nationally-standardized tests 
and two tests constructed specifically for this purpose were employed. 


The Cases Studied 


The Checker population in this company consisted of 45 employees, 
40 of whom cooperated in the study. One of the cases withdrew before 
the study was completed. 

Thirty-eight of the Checkers were females. Ages of the Checkers 
ranged from 25 years to 57 years with a mean of 39 years and a standard 
deviation of 8.84 years. Number of years of education ranged from 4 
through 12 years (mean = 9 years; SD = 1.87 years). Length of serv- 
ice in the Checker classification ranged from 3 months to 32 months, 
with a mean of 23 months and a standard deviation of 10.36 months. 
With the exception of the one Checker having 3 months experience as a 
Checker (who had had 12 months of part-time Checker experience work- 
ing out of the Finisher classification) the range of experience was from 6 
months to 32 months. 

Total length of service with the company ranged from 8 months to 413 
months with a mean of 153 months and a standard deviation of 115.74 
months. All Checkers had had some previous experience as Finishers. 


Description of the Job 


Five classifications constitute the Checker Job Family. These clas- 
sifications are Biological Label Checking, Pharmaceutical Label Check- 
ing, Biological Vaccine Checking, Biological Packaging Checking, and 
Pharmaceutical Packaging Checking. Job analyses have shown that the
majority of duties within each of the classifications does not show any
great variation among classifications. The duties common to all Checker
classifications are:


1. The checking of labels, cartons, and circulars against work tickets for 
correctness of product, product numbers, amount of the product, and dosage 
strengths. 

2. The checking of products for quality and appearance (liquids for foreign 
substances; packages, bottles, and vials for obliqueness of labels, etc.). 

3. The preparation of final-stage control records for the Shipping
Department. The records cover number of units produced by the Finishers,
product name, and control number.


4. The final-stage comparison of labels on market containers with work
tickets.


Checkers have some short-cycle responsibilities which are not
commensurate with the total job pattern. Examples of this short-cycle
heavy responsibility are the final check for correct labeling and
packaging and the responsibility for halting work operations where
judgment indicates that quality is not up to par.

Observation of the Checker’s job reveals that sustained attention for 
long periods of time as well as distribution of attention are important 
job requirements. For example, labels on which a number of items are 
to be checked including a dosage strength of 1 mg. of a chemical, are 
sometimes followed without interruption by a flow of labels on which the 
only difference is a dosage strength of .1 mg. of the same chemical. 

The requirement of visualization of spatial relations is seen in the 
judgment of acceptable or non-acceptable obliqueness of labels on bot- 
tles, vials or packages. Manual agility factors (eye-hand coordination 
and finger dexterity) exist in the rapid handling and inspection of labels 
and market containers. 

Production incentive rates are not used for Checkers since the prime 
responsibility is one of accuracy. However, since most Checkers pass 
on the work of Finishers who are paced by machines, the pace becomes 
quite rapid for short periods of time. 


Description of the Tests 


By means of the foregoing job analysis, the following tests were de- 
cided upon: 


1. The Thurstone Test of Mental Alertness. 

2. Test BPC-1. This is a high-level attention test designed to meet the 
distribution of attention the job demands as mentioned above. In construct- 
ing the test 1600 single digit numbers were arranged in random order by means 
of the Peatman and Schafer Tables of Random Numbers (9). The subject’s 
task is to “cross out every 2 and draw a circle around every 3 until you come 
to a 7; then you are to draw a circle around every 2 and cross out every 3 until 
you come to the next 7.” The subject must reverse the crossing-encircling 
procedure each time he encounters a 7. The time limit on the test is set at 
10 minutes. Since accuracy is stressed on the Checker’s job, accuracy is
emphasized in the test directions. Scoring is in terms of the number of
errors committed, disregarding the reversal process (i.e., only the omission
of 2’s and 3’s is counted in the final error score).

3. Test BPC-2. This is a test designed to measure the job demand of 
attention under simple or constant mental set. It consists of 1600 upper-case 
letters from A to J arranged randomly by means of the Peatman and Schafer 
tables. Under a time limit of 5 minutes, the subject is required to cross out 
all A’s and E’s encountered. Error scoring is in terms of A’s and E’s missed. 

4. The Minnesota Clerical Test. 

5. The MacQuarrie Tests for Mechanical Ability. 

6. The Pennsylvania Bi-Manual Worksample. 
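
The construction and scoring rules given for Tests BPC-1 and BPC-2 can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' materials: Python's `random` module stands in for the Peatman and Schafer tables, and the function names and the `reached` parameter are hypothetical.

```python
import random


def make_sheet(symbols, n=1600, seed=0):
    """Return n symbols in random order (stand-in for a printed random table)."""
    rng = random.Random(seed)
    return [rng.choice(symbols) for _ in range(n)]


def bpc1_key(digits):
    """Map the position of every 2 and 3 to the mark the directions call for.

    The subject starts by crossing out 2s and circling 3s, and swaps the
    two marks each time a 7 is met.
    """
    cross, circle = 2, 3
    key = {}
    for pos, d in enumerate(digits):
        if d == 7:
            cross, circle = circle, cross
        elif d == cross:
            key[pos] = "cross"
        elif d == circle:
            key[pos] = "circle"
    return key


def omission_errors(sheet, targets, marked, reached=None):
    """Count target symbols left unmarked.

    The kind of mark is ignored, in line with the statement that only
    omissions enter the error score. `reached` (an assumption) limits
    scoring to the part of the sheet covered before time was called.
    """
    end = len(sheet) if reached is None else reached
    return sum(1 for pos in range(end)
               if sheet[pos] in targets and pos not in marked)


# BPC-1: a short hypothetical strip of digits and the positions a subject marked.
digits = [2, 3, 7, 2, 3, 5, 2]
key = bpc1_key(digits)
bpc1_score = omission_errors(digits, {2, 3}, marked={0, 1, 3})

# BPC-2: cross out every A and E among the letters A to J.
letters = list("ABEJAE")
bpc2_score = omission_errors(letters, {"A", "E"}, marked={0, 2, 4})

full_sheet = make_sheet(range(10))   # 1600 single digits, as in BPC-1
```

In the short strip, the subject missed the 3 at position 4 and the 2 at position 6, so the BPC-1 error score is 2; one E was missed on the letter strip, so the BPC-2 error score is 1.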


The Criterion 


Examination of department records of Checker performance revealed 
that they were merely “spot-check” records, unsystematic and not
subject to reliable quantification. Rating scales were therefore designed
around specific Checker job requirements and submitted to the assistant 
department managers in charge of Checker operations. This linear scale 
rating, arbitrarily weighted with accuracy items, was scored on a 20 point 
basis. Table 1 shows the means and standard deviations of the ratings 
by each of 4 assistant department managers, the immediate supervisors 
of the Checkers attached to their various departments. 


Table 1

Means and Standard Deviations of Ratings Assigned to Checkers by
Assistant Department Managers

Rater    No. of Checkers Rated     Mn.     S.D.
A                  4               10.9    5.00
B                  5                7.7    3.87
C                 22               11.6    2.89
D                  9               11.0    3.19
Total             40





It can be seen from Table 1 that severity-leniency of ratings as well 
as the spread of ratings differed too much for the direct combination of
Managers’ ratings. Accordingly, the ratings by each manager were con- 
verted into ranks and then into normalized per cent scores using the table 
by Hull (7, 8). This 100 point scale was reduced to a 10 point scale and 
scores were rounded to the nearest whole number. The resulting dis- 
tribution of criterion scores on the 40 Checkers had a mean of 5.1 and a 
standard deviation of 1.88. 
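
The conversion of one manager's ratings into ranks and then into normalized scores can be sketched as follows. Hull's table is not reproduced in this paper, so the sketch substitutes the normal-curve calculation such tables are generally built on; the mid-rank percent position, the 10 point scale centered at 5 with a standard deviation of 2, and all names and ratings are assumptions made for illustration.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank 1 = highest value; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def transmute(values, mean=5.0, sd=2.0):
    """Turn one rater's ratings into whole-number scores on a 0 to 10 scale."""
    n = len(values)
    scores = []
    for rank in average_ranks(values):
        share_above = (rank - 0.5) / n          # mid-rank percent position
        z = NormalDist().inv_cdf(1 - share_above)
        scores.append(round(min(10, max(0, mean + sd * z))))
    return scores


# Hypothetical ratings by a single manager on the 20 point scale.
ratings = [12, 8, 15, 8, 10]
print(transmute(ratings))
```

Because each manager's ratings are ranked separately, a lenient and a severe rater end up on the same scale before their scores are pooled.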

Reliability of the criterion was determined by a re-rating 10 weeks
after the first rating. The original 40 cases were used on both rating and 
re-rating. Ratings were again transmuted. The mean re-rating score 
was 5.0 with a standard deviation of 1.88. The Pearson r between the 
first and second transmuted criterion scores was .86. Rank-Difference 
Rho’s between rating and re-rating transmuted criterion scores for the 
four department managers were .81, .88, .94, and .95 for N’s of 9, 5, 22, 
and 4 respectively. 
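
The two reliability coefficients named here follow standard formulas, sketched below. The data are hypothetical, since the individual ratings are not given in the paper, and the rank-difference formula is shown in its simple form for untied ranks.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_difference_rho(rank_x, rank_y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), for untied ranks."""
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical ranks of five Checkers on first rating and on re-rating.
first = [1, 2, 3, 4, 5]
second = [2, 1, 3, 4, 5]
rho = rank_difference_rho(first, second)
r = pearson_r(first, second)
print(rho, r)
```

With untied ranks the two coefficients coincide, which is why rho serves as a convenient small-sample substitute within each manager's group.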

No external criterion against which the ratings could be validated 
was available. However, the alignment of Checkers on the criterion 
scores did show substantial agreement with the qualitative “spot-check”
accuracy records of the four departments involved. 

The question as to whether it was necessary to control the criterion 
for amount of experience and age was handled by correlating various 
biographical data with the transmuted criterion scores. Table 2 reveals 
no significant relationship between these variables and success or failure 
as a Checker.¹ By reason of these coefficients and by inspection of
scatter-diagrams it was concluded that, since the data showed neither
curvilinearity nor relationship to the criterion, it was not necessary to
adjust the criterion.

Table 2

Means and Standard Deviations of Biographical Items and the Correlation of
Biographical Items with the Criterion

Variable                                           Mn.     S.D.      r
1. Age (in years)                                  38.9     8.84    -.17
2. Education (in years completed)                   9.1     1.87     .25
3. Checker Service (in months)                     23.2    10.36    [illegible]
4. Total Service with the Company (in months)     153.3   115.74     .11





Table 3 presents the validities of the various tests in the battery. 
The Thurstone scores, the two cancellation tests, and the MacQuarrie 
Tracing, Copying, Blocks, Pursuit, and Total reach the 5% level of 
significance. The Pennsylvania Bi-Manual Worksample has the lowest 
validity of any of the tests. 
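
The critical values of r quoted in footnote 1 (.304 and .393 at 40 degrees of freedom) can be recovered from the usual relation between r and t. In the sketch below the two-tailed t values are typed in from a standard t table rather than taken from the paper, and the function name is hypothetical.

```python
from math import sqrt

# Two-tailed critical t values for 40 degrees of freedom, copied by hand
# from a standard t table (an input assumed here, not given in the paper).
T_CRIT_DF40 = {0.05: 2.021, 0.01: 2.704}


def critical_r(t_crit, df):
    """Smallest |r| that reaches significance: r = t / sqrt(t^2 + df)."""
    return t_crit / sqrt(t_crit ** 2 + df)


r_05 = critical_r(T_CRIT_DF40[0.05], 40)
r_01 = critical_r(T_CRIT_DF40[0.01], 40)
print(round(r_05, 3), round(r_01, 3))
```

The same function can be applied with 37 degrees of freedom once the matching t values are looked up; the required r is then slightly larger than at 40 degrees of freedom.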

Table 3

Correlation Between Tests and the Criterion

Name of Test                                        Mn.     S.D.      r
 1. Thurstone Mental Alertness “L” Score            19.7    12.67     .30
 2. Thurstone Mental Alertness “Q” Score            12.7     8.85     .38
 3. Thurstone Mental Alertness “G” Score            32.4    20.65     .35
 4. BPC-1 (No. of Errors)                            4.8     5.46    -.39
 5. BPC-2 (No. of Errors)                            8.4     8.69    -.36
 6. Minnesota Clerical, Number Checking             86.8    24.60     .13
 7. Minnesota Clerical, Name Checking               79.9    29.56     .26
 8. Penna. Bi-Manual Assembly (Time in Sec’s)      308.8    45.23    -.03
 9. Penna. Bi-Manual Disassembly (Time in Sec’s)   159.9    25.51    -.06
10. MacQuarrie Tracing                              29.3     8.68     .34
11. MacQuarrie Tapping                              36.6     7.46     .10
12. MacQuarrie Dotting                              18.4     3.69     .26
13. MacQuarrie Copying                              19.9    13.47     .34
14. MacQuarrie Location                             17.6     8.58     .23
15. MacQuarrie Blocks                                6.7     5.45     .33
16. MacQuarrie Pursuit                              15.0     7.97     .35
17. MacQuarrie Total                                47.8    13.76     .39

¹ The Wallace-Snedecor tables (5) require an r of .304 for significance at
the 5% level and an r of .393 for significance at the 1% level, with 40
degrees of freedom (degrees of freedom for our study = 37).

It was hypothesized that the low validity of the Minnesota Clerical
Test might be due to the fact that the usual scoring of this test (R-W)
weights speed more heavily than it does accuracy. The test authors
state, “In view of the fact that the test is primarily a speed test with
accuracy as a negligible factor, it would seem wise to continue to use the
R-W scoring formula in order to penalize those subjects who happen to
be unusually (italics ours) careless or inaccurate” (1). Similarly,
Copeland (2) has presented r’s between number of items attempted and the
R-W score of .96 and .89 for Numbers and Names respectively.

Accordingly, the test was re-scored for number of errors. The result- 
ing validities for Numbers and Names respectively were —.21 and —.09. 
Thus the Numbers validity was raised slightly, but the Names validity
was lowered substantially. The data in Table 4 show no substantial
relationship among scores on the cancellation tests BPC-1 and BPC-2
and the Minnesota Clerical Test.











Table 4

Intercorrelations of BPC-1, BPC-2, and the Minnesota Clerical Test

Test                          2       3       4
1. BPC-1 (No. Errors)        .39    -.27    -.32
2. BPC-2 (No. Errors)               -.27    -.12
3. Minn. Names (R-W)                         .71
4. Minn. Numbers (R-W)





It would appear that the MacQuarrie tests showing validity in this
study are concerned with spatial measurement, except the Tracing Test,
which did not structure into the Harrell S factor (6).













The data were subjected to Wherry Doolittle analysis in order to find 
the battery yielding maximized shrunken multiple prediction (10). The 
order of entry of the tests into the shrunken multiple is shown in Table 
5. The maximized shrunken multiple correlation coefficient obtained 
was .48. 
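
The shrinkage step applied as each test enters the battery can be sketched as below. The formula shown, with N - 1 over N - m where m is the number of tests selected so far, is the form in which the Wherry correction is commonly presented for the Wherry Doolittle procedure; the paper does not print it, so treat the exact variant and the function name as assumptions. The trial value of .55 is hypothetical.

```python
from math import sqrt


def shrunken_r(r_multiple, n_cases, m_tests):
    """Shrink a multiple R for the number of tests already selected.

    R-bar^2 = 1 - (1 - R^2) * (N - 1) / (N - m)
    Returns 0.0 if the corrected value would be negative.
    """
    k_squared = 1 - r_multiple ** 2
    corrected = 1 - k_squared * (n_cases - 1) / (n_cases - m_tests)
    return sqrt(corrected) if corrected > 0 else 0.0


# With one test the correction factor is 1, so R is unchanged.
first_step = shrunken_r(0.39, 39, 1)

# A hypothetical unshrunken R of .55 for four tests on 39 cases.
four_tests = shrunken_r(0.55, 39, 4)
print(round(first_step, 2), round(four_tests, 2))
```

Under this form the first entry in the order of selection equals the highest single validity, and each added test must raise R by more than the growing penalty for the shrunken value to keep increasing.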











Table 5

Tests in Their Order of Entry into the Shrunken Multiple
Correlation Coefficient

Test                                      R
BPC-1 (Accuracy)*                        .39
Thurstone Mental Alertness Q Score       .45
MacQuarrie Copying                       .47
BPC-2 (Accuracy)*                        .48




* Correlation and intercorrelation signs were reversed for an accuracy interpretation 
prior to Wherry Doolittle Analysis. 


Raw scores on these four tests were summed for each individual and
transmuted into the 10 point scale used with the criterion ratings.
The zero-order r between the transmuted test battery scores and the 
transmuted criterion scores was found to be .57. The mean transmuted 
test battery score was 5.0 with a standard deviation of 1.91. 


Previous Studies 


To the authors’ knowledge, only one study reported in the literature 
is directly concerned with the problem of the selection of drug checkers. 
Ghiselli (3, 4) has reported the results of testing 26 inspector-packers 
whose duties very closely resemble the duties of the subjects of this experi- 
ment. Ghiselli’s inspector-packers however had some finishing require- 
ments (filling vials, bottles, and capsules) and, as a result, the jobs prob- 
ably had greater speed and dexterity requirements. 

Using a combination of a forelady’s and the finishing room
supervisor’s ratings (between-rater reliability = .72), Ghiselli obtained
the validities shown in Table 6.

Ghiselli obtained an R of .72 between four tests (Minnesota Paper 
Form Board, Pegboard, Copying, and Dotting) and his criterion. Copy- 
ing, evidently measuring a spatial factor, enters into the Wherry Doolittle 
in Ghiselli’s case as it does in ours. Furthermore, the amount of validity
of several of Ghiselli’s tests is quite close to the validities of the same tests 
in our study (Table 6). In general, spatial and dexterity-speed tests 
have more predictive value in the Ghiselli battery; accuracy of attention 
and spatial tests have more predictive value in our battery. 


Table 6

Validities of Tests in the Ghiselli Battery for the Selection of Inspector-Packers and,
in Parentheses, the Validities of Coinciding Tests in our Battery

                                          r            r
Test                                   Ghiselli    This Study
 1. Minnesota Paper Form Board           .57
 2. Pegboard*                           -.50
 3. Minnesota Placing                   -.24
 4. Minnesota Turning                   -.40
 5. Minnesota Number Checking            .29         (.13)
 6. Minnesota Name Checking              .26         (.26)
 7. MacQuarrie Tracing                   .09         (.34)
 8. MacQuarrie Tapping                   .24         (.10)
 9. MacQuarrie Dotting                   .28         (.26)
10. MacQuarrie Copying                  -.11         (.34)
11. MacQuarrie Location                  .07         (.23)
12. MacQuarrie Blocks                    .24         (.33)
13. MacQuarrie Pursuit                   .22         (.35)
14. MacQuarrie Total                     .19         (.39)

* Not nationally standardized; the test required insertion of 100 pegs,
[illegible]" in diameter and 1" long, into 100 holes.


Summary 


1. Thirty-nine Checkers in a Pharmaceutical-Biological manufactur- 
ing company were tested on 17 tests. 

2. The criterion employed was a transmuted rating by managers. 
The reliability of this criterion, obtained by a 10 week interval re-rating, 
was .86. The ratings showed substantial agreement with qualitative 
“spot-check” department records.

3. Two cancellation tests, the Thurstone Mental Alertness Q score, 
and the MacQuarrie Copying subtest, gave a maximized shrunken pre- 
diction of the criterion of R = .48. The zero-order correlation between 
transmuted test battery and criterion scores was found to be r = .57. 

4. A previous study on drug finisher-checkers has found more value 
in dexterity-speed tests. 

Received July 18, 1947. 


Factors Related to the Proficiency of Motor Coach Operators


Clarence W. Brown and Edwin E. Ghiselli 
University of California 


During the war years traction companies as well as other kinds of in- 
dustries sought personnel wherever they could be found and whatever 
their caliber. The range of individual differences of personnel applying 
for war time jobs in many cases appears to have been considerably greater 
than the range of peace time applicants. Not only were many persons 
hired who were below what in peace times was considered marginal ability 
for industrial jobs but also many were employed who were at higher levels 
educationally and intellectually than the usual run of workers. In the 
traction company, one of whose personnel problems is the concern of this 
paper, the range of individual differences in general intellectual ability 
and amount of education was nearly twice as great among war time appli- 
cants as among those seeking employment in the pre-war years. This 
increased range of individual differences permits an evaluation of selective 
devices under optimal conditions. Under these circumstances we pro- 
pose here to evaluate four factors which are weighted heavily by certain 
transit companies in the selection of motor coach operators, namely,
intelligence test score, age, amount of education, and marital status.


Basic Data 


The data to be reported here are the records of a sample of men who 
applied for the job of motor coach operator in a city transit system during 
the years 1943 and 1944. For all of these persons scores were available 
on a sixty item intelligence test that was routinely administered to all ap- 
plicants. In addition to intelligence test scores, for men who actually 
reported for work, records of age, years of schooling, and marital status 
were at hand. 

Under ordinary circumstances in the selection of motor coach oper- 
ators a score of 40 on the test was set as a critical score for inexperienced 
persons but for those with a particularly favorable background a score of 
30 was accepted. Although the test was administered to applicants 
during the war years these minimums were not adhered to. In addition 
to the minimum intelligence test scores the attempt also was made to hire 
men in their twenties and thirties, who were high school graduates, and
who were married.

Of the 363 applicants considered here, 247, or 68%, reported for duty.
The remaining 116 persons were “no-shows,” that is, they were processed 
by the employment office and were hired but never appeared for duty. 
Since employment processing entails considerable expense, involving a
medical examination and requiring three to five hours time on the part 
of the management and the union, the problem of no-shows is of some 
importance. No-show, therefore, was used as one criterion measure. 

The two main criterion measures used are length of time on the job and 
accident rate. For the 247 men who were hired and actually drove buses 
records were available for 20 months of employment. Some of the men 
remained on the job for as short a period as one month and only 15% re- 
mained on the job longer than 20 months. The other 85% either left 
voluntarily or were dismissed for inadequate performance. Cases of 
termination due to draft into the armed services were eliminated. This 
very high labor turnover is not characteristic of normal times although 
turnover is sufficiently large to be a problem. 

For each operator accident rate was calculated as the number of ac- 
cidents in which he was involved divided by the number of months he was 
employed and on duty. The relationship between this accident rate and 
months of experience on the job appeared to be linear and was of the order 
of -.13. Although this relationship is low the accident rate for each man 
was corrected by the experience factor. It was impossible to divide the 
accident records into two equal periods in order to obtain an estimate of 
reliability. However, the accidents were broken down into two cate- 
gories, collision and non-collision accidents, and the coefficient of correla- 
tion between these two types of accidents was found to be .25. This 
coefficient is suggestive of at least a moderate degree of reliability of ac- 
cident records. Collision accidents were about three times as numerous 
as non-collision accidents. 
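
The rate computation described above, and one plausible form of the experience correction, can be sketched as follows. The paper does not say how the correction was carried out; the sketch assumes a linear adjustment that removes the regression of rate on months of service. The operator records are invented for illustration only.

```python
from statistics import mean

def accident_rate(accidents, months_on_duty):
    """Accidents per month on duty, as defined in the text."""
    return accidents / months_on_duty

def correct_for_experience(rates, months):
    """Remove the linear trend of rate on months (assumed method).

    The corrected rates keep the original mean but are uncorrelated
    with months of experience.
    """
    m_rate, m_months = mean(rates), mean(months)
    sxy = sum((x - m_months) * (y - m_rate) for x, y in zip(months, rates))
    sxx = sum((x - m_months) ** 2 for x in months)
    slope = sxy / sxx
    return [y - slope * (x - m_months) for x, y in zip(months, rates)]

# Hypothetical records: (number of accidents, months on duty)
records = [(3, 2), (4, 5), (6, 9), (5, 12), (9, 20)]
months = [m for _, m in records]
rates = [accident_rate(a, m) for a, m in records]
corrected = correct_for_experience(rates, months)
```

Because the observed relationship was only of the order of -.13, a correction of this kind would change most individual rates very little.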


Results 


The biserial coefficient of correlation between the no-show criterion 
and the intelligence test was .23. This coefficient while not high assumes 
some significance when considered in light of the critical scores set for the 
test. Whereas 32% of the entire group were no-shows, 42% of those 
scoring below 40 and 51% of those scoring below 30 were of this sort. 
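
For reference, the biserial coefficient is conventionally computed from the mean test scores of the two criterion groups. The expression below is the usual textbook form and is not quoted from the paper; the symbols are introduced here for illustration.

```latex
\[
r_{\mathrm{bis}} \;=\; \frac{M_p - M_q}{\sigma_t}\cdot\frac{p\,q}{y}
\]
% M_p, M_q : mean test scores of the men who reported and of the no-shows
% \sigma_t : standard deviation of the test scores of the whole group
% p, q     : proportions in the two groups (here .68 and .32)
% y        : ordinate of the unit normal curve at the point dividing p from q
```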

In Table 1 are given the coefficients of correlation among the intelligence 
test scores, years of schooling, and age at time of hire, and the 
coefficients of correlation between these variables and accident rate (cor- 
rected for experience) and months remaining on the job. With the pos- 
sible exception of the correlation between age and months on the job 
none of these coefficients show any promise for the prediction of profi- 
ciency either in terms of accident rate or time on the job. 


Table 1

Intercorrelations Among Intelligence Test Scores, Years of Schooling, Age at Time of
Hire, Accident Rate per Month (Corrected for Experience), and Time
on the Job, of 247 Motor Coach Operators

                                          Accident   Months
                                          Rate       on                  Standard
                      Schooling   Age     per Mo.    Job       Mean      Deviation
Intelligence
  test score             .33      -.07     -.05       .09      37.20      10.40
Years of
  schooling                       -.19     -.01       .04       9.96       1.96
Age at time
  of hire                                  -.01       .21      28.36       6.24





The relationships between marital status and accident rate and time 
on the job are shown in Table 2. On both measures of proficiency the 
married operators are slightly superior to the single ones. The number 
of divorced operators is small, but they too appear to be more proficient 
than the single operators. 


Table 2

Marital Status in Relation to Accident Rate per Month (Corrected for Experience)
and Time on the Job of 247 Motor Coach Operators

                          Accident Rate             Time on Job
Marital      No. of                Standard                 Standard
Status       Cases       Mean      Deviation      Mean      Deviation
Single         52         .98        1.09         6.64        6.70
Divorced       13         .64         .15         7.18        6.10
Married       182         .71         .60         8.00        7.00


Conclusions


The common standards used in hiring motor coach operators, namely, 
that the desired employee should score high on an intelligence test, be 
young, with considerable education, and married, do not appear to have 
much validity. Married operators are slightly superior to single oper- 
ators both in terms of accident rate and time remaining on the job, and 
the older operators tend to stay on the job for longer periods of time than 
the younger ones, but none of the other relationships deviates appreciably 
from zero. Since the validities of the individual predictors are so low it 
is not likely that any weighted combination of them would be particularly 
helpful in the prediction of success of motor coach operators. 


Received October 7, 1946. 








Distributions of Test Scores of Industrial 
Employees and Applicants 


Harold F. Rothe 
Stevenson, Jordan & Harrison, Inc., Chicago, Illinois 


Tests that are used for selecting and placing personnel are frequently 
validated for particular jobs by being tested against the performance of 
employees on those jobs. Those tests found valid are incorporated into 
the interviewing procedure with appropriate critical scores. As a gen- 
eral rule there is an understanding among the employer and employees 
that no one’s job status will suffer as a consequence of his test results and 
that the taking of the tests for validation purposes is on a voluntary basis. 

It is highly essential under this procedure that the critical scores thus 
established be considered tentative and that a follow-up analysis of ap- 
plicants’ scores be made. The reason for this is that a different dis- 
tribution sometimes occurs when the tests are subsequently given to ap- 
plicants. For example, a test may be validated against the existing em- 
ployee force and a critical score may be set that would have eliminated 
from a certain job the bottom 15% of those employees. The test is later 
administered to applicants using that critical score but it is found that 
practically all applicants pass the test. A higher critical score must be 
set. As a general rule a critical score that would have eliminated about 
40% of the employees from a job will disqualify about 15% of the applicants 
for that job. Some examples of such a shift in test score distributions 
are shown in Table 1. 
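
The critical ratios (C.R.) in Table 1 can be checked from the tabled means, standard deviations and numbers of cases. The paper does not state its formula; the sketch below assumes the common form, the difference between the two means divided by the standard error of that difference for independent groups. Under that assumption it reproduces the tabled values for the Code Identification and Spatial Relations tests; the other two rows differ somewhat, so the author's exact computation may not have been identical.

```python
from math import sqrt

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error
    (independent groups, variances not pooled; an assumed formula)."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / se_diff

# Employees first, applicants second; values taken from Table 1
code_identification = critical_ratio(27.82, 15.33, 56, 34.90, 13.85, 129)
spatial_relations = critical_ratio(7.77, 3.76, 188, 11.20, 5.51, 216)

print(round(code_identification, 2))  # 2.97
print(round(spatial_relations, 2))    # 7.38
```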

The Purdue Adaptability test data and the Code Identification test 
data are both from a commercial laundry plant in a medium sized city. 
The Code Identification test is a test for laundry employees based upon 
the principle of immediate recognition. The score on this test is the 
total of right minus wrong answers on four trials. 

The Spatial Relations test requires the testees to change a figure into 
a square by moving one line. The Line Assembly test is a multiple choice 
test in which the testee indicates which of four alternative shapes can be 
made from a given set of lines. The two distributions for the Spatial 
Relations test are shown in Figure 1. 

Table 1
Test Scores of Employees and Applicants

Test                    Group         Mean     S.D.     C.R.      N
Purdue Adaptability     Employees      9.16     5.16     1.52     55
                        Applicants    10.14     4.53             130
Code Identification     Employees     27.82    15.33     2.97     56
                        Applicants    34.90    13.85             129
Spatial Relations       Employees      7.77     3.76     7.38    188
                        Applicants    11.20     5.51             216
Line Assembly           Employees     20.60     5.49    12.90    191
                        Applicants    28.22     5.88             189


Fig. 1. Distribution of raw scores of employees and applicants 
on the spatial relations test. 


Several possible reasons have been advanced to explain these shifts in 
distributions of applicants over employees’ test scores. The present 
data permit a test of the plausibility of these reasons and hence they are 
discussed below. 


1. One reason that has been advanced is that applicants are younger 
than employees and young people get higher test scores than older per- 
sons. The Code Identification and Line Assembly tests, however, are 
not significantly related to age and hence the youth of the applicants 
does not fully explain their shifts. 

2. A second reason is that many applicants are test-broken as a re- 
sult of military experience. In the laundry treated here all applicants 
and employees were female with practically no veterans included in these 
groups. This same shift has been found in several other plants employing 
women and there, too, very few applicants were service women. 

3. A third possibility for these shifts is that the applicants are applying 
for different jobs and data have been lumped for varying groups of 
people thus making the two distributions non-comparable. One psychologist 
told the writer of his having observed this phenomenon in one plant 
where office and shop employees were both given the same general adapt- 
ability test. An increase in office hiring, with the attendant higher scores 
of office personnel, made such a shift. In the present instances all data 
refer to shop applicants only. 

4. Some employment men have suggested that this shift is because 
“the word gets around” that a given company is using tests and therefore 
a “better class” of people apply for jobs there. Some laundries have 
found this shifting of scores and have attributed this shift to the “word 
getting around the colored section of the city.” This suggestion is wrong 
on logical grounds and the data below indicate that this reason is incor- 
rect in practice. This suggested reason involves the assumptions that 
(1) people who cannot pass the tests recognize and admit this fact; (2) 
the city is large enough to provide a large labor supply and all potential 
job applicants no longer apply (i.e., a self-imposed selection ratio is opera- 
tive) and (3) the tests are valid—since “better” people apply. There 
is no particular justification for any of these assumptions. 


The data in Table 1 on the Spatial Relations and Line Assembly tests 
are from a large machine shop in a small town. This machine shop is the 
dominant industry in this town and the surrounding farm area. Since 
the installation of employment tests in this shop there has been no de- 
crease in the number of applicants. Hence it appears that there is no 
question of the “word getting around” in this small town, and still the 
shift of scores is found. These data serve to some degree as a control 
against the possibility of unqualified applicants ceasing to apply for jobs. 
This further emphasizes the fact that tests should be validated and not 
merely installed. The mere installation of a test battery does not auto- 
matically raise the level of applicants to a higher level of ability for the 
specific jobs in question. 

5. The fifth possibility for this shift is that job applicants are more 
highly incentivated than are employees already on the job. This appears 
to be the most likely explanation of this phenomenon of shifting test scores. 
A somewhat similar phenomenon among college students has been reported 
by Tiffin, Knight and Josey¹ and Seashore.² The present data 
suffer from a lack of complete controls but of the five possible explanations 
this fifth one is most supported by the available data. One might reasonably 
assume that a person applying for a job will do his best on a qualifying 
test whereas an employee taking these tests as a favor to management 
in some vague future situation, i.e., to validate tests that will be 
given to other, future people, has no such incentive. 


1 Tiffin, J., Knight, F. B., and Josey, C. C. The psychology of normal people. Boston: 
D. C. Heath & Co., 1940, p. 87. 

2 Seashore, H. G. The improvement of performance on the Minnesota Rate of 
Manipulation Test. J. appl. Psychol., 1947, 31, 254-259. 


Conclusion 


In conclusion, the present data indicate the need for a follow-up anal- 
ysis of test scores when tests are validated against the existing employee 
force. A critical score established on the basis of employees’ scores may 
be entirely too low when applied to applicants. They further indicate 
that tests must be validated in each situation. The mere presence of em- 
ployment tests does not insure the selection of properly qualified per- 
sonnel. The tests must be validated for the jobs in question. The 
greater test-taking incentivation of applicants appears to account for 
the shifted distribution of their scores as compared with the distribution 
of employees’ scores. 


Received June 25, 1947. 
Early publication. 





Output Rates among Machine Operators: I. Distributions 
and Their Reliability 


Harold F. Rothe 
Stevenson, Jordan & Harrison, Inc., Chicago, Illinois 


The most difficult problem confronting industrial psychologists to- 
day is probably that of finding an adequate criterion against which to 
measure the effect of some variable that is introduced. It is far easier 
to make a reliable employment test than it is to validate that test. It 
is easy to convince oneself of the value of rest periods, music, foreman 
selection, a training program, etc., but it is very difficult to prove that 
value. Production figures are commonly considered to be the most de- 
sirable criterion, but very little is actually known about this criterion. 
How are these output rates distributed, and are they consistent from time 
to time? What generalizations can we make about production figures as 
a criterion? The present papers report one study of output rates as a 
criterion. 

Nature of the Data 

The data for this study were taken from the official books of the 
Standards Department of the Four Wheel Drive Auto Company, located 
at Clintonville, Wisconsin. They cover the period from December 10, 
1945 through January 20, 1946. They refer to regular employees in the 
machine shop of that firm. The men were working at their regular jobs 
and at their usual workplaces. It should be recalled that this was a 
period of acute shortages in materials for this and most other plants. 
This prevalence of shortages is an uncontrolled variable insofar as com- 
paring these with other data. On the other hand this kind of a situation 
(“not normal”) represents the realistic condition that constantly exists 
in industrial work and the data are therefore more, rather than less, valu- 
able since they reflect industry in operation. 

The operators in this shop were on standards but not on incentives. 
That is, a standard hour’s work had been determined by time studies but 
the men were paid on an hourly rate basis. The existence of these standards 
makes possible a combining of the outputs of men doing different 
jobs. It is unfair to compare the number of gears one man makes with 
the number of pieces another man blanks out on a punch press, but we 
can compare the relative efficiency of performance of these two men if a 
standard of performance has been established for each job. Standard 
production of the operators was calculated daily and summarized bi-weekly 
for each operator and recorded in the Standards Department. Three of 
these bi-weekly sets of figures were used in this study. These were for 
the periods ending on December 23, 1945, January 6, 1946, and January 
20, 1946. This time includes the Christmas and New Year’s Day holidays, 
with whatever influence these occasions might have exerted. 

1 Grateful acknowledgment is made to Robert A. Olen, General Manager of the 
Four Wheel Drive Auto Company, for permission to publish these data, and to G. F. 
DeCoursin, Manager of the Standards Department, for aid in collecting and analyzing 
these data. 


Distribution of the Data 


Frequency distributions of the output rates for 130 men are shown in 
Figure 1. These men were the first 130 men in the record books who 
worked 30 or more hours on standard jobs during all three of the two week 
periods. All of these distributions approximate the normal distribution. 

The ranges from the greatest to the least production in each distri- 
bution are 30-80, 17-88, and 15-86, respectively. The ratio of the best 
to the worst producer in each period is therefore 2.7:1, 5.2:1, and 5.7:1, 
respectively. These ratios are somewhat higher than those previously 
reported by Evans (1) (4:1, and 3:1), by Hull (2) (1.4:1), by Stead and 
Shartle (5) (4:1, 3:1, and 1.5:1), by Tiffin (6) (2.5:1), and by Rothe (4) 
(1.73:1). 
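
The ratios quoted above follow directly from the reported ranges, as the short check below shows. Only the ranges given in the text are used; the labels and the function name are illustrative.

```python
# Least and greatest output rates reported for the three periods (Figure 1)
ranges = {"period 1": (30, 80), "period 2": (17, 88), "period 3": (15, 86)}

def best_to_worst(low, high):
    """Ratio of the best producer's rate to the worst producer's rate."""
    return high / low

ratios = {label: round(best_to_worst(low, high), 1)
          for label, (low, high) in ranges.items()}
print(ratios)  # {'period 1': 2.7, 'period 2': 5.2, 'period 3': 5.7}
```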

It appeared possible that this rather large ratio might have been at- 
tributable to the relatively small number of hours worked on standard 
jobs by some of the men during each of these periods, the hypothesis being 
that the lower output rates were made by men who worked too few hours 
on standard jobs to become warmed up to those jobs. To control this 
possibility another sample was selected of 130 men who worked less than 
30 hours on standard jobs in any one period. Each of these men is shown 
only once in Figure 2 although this distribution covers data from all three 
periods. The range in Figure 2 is 10-80, the ratio is 8:1 and the distribution 
is again roughly normal in appearance. (The variability of this 
distribution, measured by the interquartile spread, is slightly greater 
than the variability of the distributions in Figure 1.) It is concluded 
from these two figures that the ratios of the ranges are enormous regard- 
less of the number of hours spent on standard jobs, within the limits of 
the present data. 

The enormousness of these ranges has great significance from both 
an incentive and a selective point of view. If it is assumed that all of 
these men were highly incentivated to turn out high production then the 
value of psychological tests to select only employees with the proper 
aptitudes is apparent. On the other hand, if it is assumed that the men 
had roughly equal aptitudes for the work there is an obvious need for a 
more effective method of incentivating them.² ³ 


Fig. 1. Output rates of 130 men for three successive two-week periods. 

Fig. 2. Output rates of 130 men who worked less than 30 hours on standard jobs 
in any two-week period. 


2 The Four Wheel Drive Auto Company has subsequently taken action on both 
these possibilities by installing employment tests and by installing a joint labor-management 
incentive system. 

3 The term “incentivation” is used here in preference to the more common “motivation” 
because the former makes clearer the fact that we are dealing primarily with 
incentives, rather than motives in this situation. 


Intercorrelations between Periods 


Correlation coefficients have been obtained between the three periods. 
The correlation between periods 1 and 2 is .57, between periods 1 and 3 
it is .72, and between periods 2 and 3 it is .68. The correlation between 
periods 1 and 2 (December 23, 1945 and January 6, 1946) is shown graph- 
ically in Figure 3. 

These correlations are fairly low when compared with the correlation 
of .96 between two successive one-week periods reported by Tiffin (6). 
This difference may have been caused by differences in working condi- 
tions, including the effect of material shortages mentioned above. It is 
more probable, however, that the lower correlations reported here reflect 
a lower level of incentivation among the employees. Tiffin’s hosiery 
loopers were paid on an incentive wage system while these machine oper- 
ators were on a daily rate. 

In previous papers, the writer reported a low intercorrelation of daily 
work curves for butter wrappers (3) and also a large range of output rates 
for each individual operator (4). The hypothesis was presented that 
“the incentives to work may be considered ineffective when the ratio of 
the range of intra-individual differences is greater than the ratio of the 
range of inter-individual differences.” In the present instance the data 
are available only in group form for two week intervals and hence a com- 
parison with the daily work of the butter wrappers is not possible. How- 
ever, it is likely that a similar hypothesis is supported by these data. 
The large range of intra-individual output of butter wrappers presum- 
ably means a large standard deviation of their output rates. This would 
result in a lower intercorrelation for each operator from period to period 
than would be true if the range and standard deviation were small. In 
the present instance, using group data, a lower intercorrelation has been 
found here for machine operations than by Tiffin for hosiery loopers. 
These results are therefore consistent with the results on butter-wrappers. 
This permits an addition to the hypothesis previously proposed, namely 
that if the intercorrelation of group output rates for two periods closely re- 
lated in time is less than .80 the incentivation is not highly effective, while 
intercorrelation higher than .90 indicates effective incentivation. 
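
The proposed rule of thumb can be written out as a small sketch. The correlation function is the ordinary product-moment coefficient; the thresholds are those stated above, and the label for the band between .80 and .90 is an assumption, since the text gives no verdict for it.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def incentive_reading(r):
    """Apply the proposed thresholds to a between-period correlation."""
    if r < 0.80:
        return "not highly effective"
    if r > 0.90:
        return "effective"
    return "indeterminate"  # assumed label; the text is silent on .80 to .90

reported = [("periods 1 and 2", 0.57), ("periods 1 and 3", 0.72),
            ("periods 2 and 3", 0.68), ("Tiffin, hosiery loopers", 0.96)]
for label, r in reported:
    print(label, r, incentive_reading(r))
```

With output rates for two periods held in two lists, `incentive_reading(pearson_r(first, second))` gives the same reading for new data.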


Fig. 3. Scattergram of output rates for 130 men for period 
12-23-45 vs. 1-6-46 (r = .57). 


These results are important because of their relation to the validation 
of tests or other variables. They indicate a possible objective measure 
of the effectiveness of incentives and this measure is a more practical one 
than that previously proposed by the writer. Finally, these results in- 
dicate the necessity for completely describing the existing conditions, 
particularly with reference to pay and other incentives, in reporting industrial 
studies. 


Received July 3, 1947. 
Early publication. 




Age and Strength* 


M. Bruce Fisher 
Fresno State College, California 
and 


James E. Birren 
Gerontology Unit, U. S. Public Health Service; Baltimore City Hospital 


The decline of bodily strength during the latter part of life has been 
a matter of sporadic scientific interest since Quetelet’s pioneering study 
(13) more than a century ago. Some of the more recent data on the prob- 
lem have been reviewed by Todd (17). Such information is of value to 
those in the fields of military and industrial personnel, since the character- 
istics of the older worker are becoming more important, and it may also 
be germane to developmental psychology and to the growing medical 
specialty of gerontology. 

In a previous paper (3), a procedure for the use of the hand dyna- 
mometer as a measure of hand strength was described, which has satis- 
factory reliability and which showed some indication of usefulness as an 
index of response to physiological stress. This paper will show the re- 
lation of some scores collected by this procedure to the age of the subjects 
tested, and will make some comparisons with earlier work concerning 
age and strength. 


Procedure and Subjects 


The dynamometer test procedure (3) requires the subject to squeeze 
the hand dynamometer at three second intervals, beginning with a squeeze 
of 27 kg. (18 kg. for women) and increasing the force exerted each time 
by an increment of 3 kg. until he is unable to achieve the required in- 
crease in level of performance. The score for one hand is the kilogram 
reading of the last try and the score for a test is usually taken as the mean 
score of the two hands. Calibrated Smedley hand dynamometers were 
used. 


* This report was prepared when the writers were on active duty in the U. S. Naval 
Reserve at the Naval Medical Research Institute, National Naval Medical Center, 
Bethesda, Maryland. The opinions expressed are those of the writers and are not to be 
construed as reflecting the policies of the Navy Department. 
Age and dynamometer data are available on six groups of subjects 
(Table 1). The group of subjects used to determine the reliability of 
the procedure was made up, for the most part, of subjects measured prior 
to serving in various experiments conducted at the Naval Medical Research 
Institute; others were medical department personnel. The Camp 
Lejeune group was measured in connection with a study carried on at 
the Medical Field Research Laboratory, Camp Lejeune, N. C. (11). 
The Waves comprised three complete companies of students in a Hospital 
Corps School (WR). The data of the three industrial groups were collected 
at two TNT plants by Past Assistant Sanitarian (R) Robert B. 
Malmo, USPHS, Industrial Hygiene Research Laboratory, National 
Institute of Health, who supplied the original data for analysis and inclusion 
in this report. The naval personnel were tested on both hands; 
the industrial groups, on the preferred hand only. 


Table 1
Age of Subjects

                                                                  Centiles
Group                             N      Mean    Median    σ      10th    90th
Naval personnel
  Reliability group: Medical
    Department enlisted men
    and officers, and seamen                      23.0            18.1    35.6
  Camp Lejeune group: Hospital
    corpsmen under instruction           21.8     20.5     4.6    18.7    27.8
  Waves: Students in a Hospital
    Corps School                         22.7     22.1     3.3    20.4    27.8
Industrial personnel: Manual
  workers in TNT plants
  Plant 1: men                   313     32.5     31.7     7.5    25.0    42.8
  Plant 2: men                   239     36.9     34.8     9.5    27.1    50.4
  Plant 2: women                  96     32.4     31.4     8.4    22.9    44.4


Results 


Correlation coefficients between dynamometer score and age, height, 
and weight have been calculated separately for the six groups of subjects 
(Table 2), and show some consistency. In the case of the correlation 
with age, the correlation ratio, eta (4), becomes a more valid measure of 
the relationship than r. There is a maximum in the curve for each sample 
when mean dynamometer score is plotted against age group and the dis- 
tribution of ages with respect to this maximum can determine both the 
size and the sign of the product-moment coefficient. Some significant 
differences among the r’s in the first row of coefficients in Table 2 are to 
be explained on this basis. 




















Table 2
Coefficients of Correlation between Dynamometer Score and Age, Height and Weight

                          Naval Personnel                       Industrial Workers
Variable
Correlated         Reliability   Camp Lejeune               Plant 1     Plant 2     Plant 2
with Dynamometer   Group         Group          Waves       Men         Men         Women
Score              N = 72        N = 90         N = 161     N = 313     N = 239     N = 98

Age (r)             .32          -.40            .00        -.16        -.38        -.20**
    (eta)           .36           .24*           .21         .20         .42         --
Height (r)          .28  .22  .35  .25  .26
Weight (r)          .54  .50  .40  .34  .16**


All coefficients are significantly different from zero (P < 1%) except .00 and 
those marked: *, P = 2-5%; **, P > 5%. 


Table 3
Dynamometer Score and Age in 552 Male Manual Workers in Industry

Age                 18-22   23-27   28-32   33-37   38-42   43-47   48-52   53-68
N                     28      82     153     126      72      42      29      20
Mean score (kg.)    53.46   56.05   54.23   53.33   52.42   50.36   48.31   46.80
σ                    5.80    6.93    6.63    6.96    6.15    7.18    6.52    5.88





The data of the two groups of male industrial workers were combined 
for treatment by analysis of variance (Table 3; Figure 1, curve 17). A 
significant relation was found between age and dynamometer score as 
shown in the following tabulation: 








Source                   Sum of Squares    d.f.    Variance
Between age groups            303.12          7      43.30
Within age groups            2723.23        544       5.01
Total                        3026.35        551


F = 43.30/5.01 = 8.64 (d.f. for F = 7 and 544). 
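
The arithmetic of the tabulation can be retraced in a few lines. Only the sums of squares and degrees of freedom given above are used; the variable names are illustrative. The F quoted in the text is obtained from the mean squares after rounding to two decimals, and carrying full precision gives a slightly larger value.

```python
ss_between, df_between = 303.12, 7
ss_within, df_within = 2723.23, 544

ms_between = ss_between / df_between   # between-groups variance estimate
ms_within = ss_within / df_within      # within-groups variance estimate

# F as printed in the text: ratio of the rounded mean squares
f_as_printed = round(ms_between, 2) / round(ms_within, 2)
# F carried at full precision
f_full = ms_between / ms_within

print(round(f_as_printed, 2))  # 8.64
print(round(f_full, 2))        # 8.65
```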


With these degrees of freedom an F of 5.67 would be significant at 
the 1% level of confidence. The data of the two male naval samples 
were similarly combined and analyzed (Figure 1, curve 18). In this 
group, which had a more restricted age range than the industrial workers, 
the F-ratio was not quite significant at the 5% level. Eta’s, computed 
on the same sets of combined data, were 0.30 for the industrial men, and 
0.24 for the naval men, values which are, respectively, more than seven 
times and more than three times their standard errors. 


[Figure 1: numbered curves of strength plotted against age; horizontal axis, age from 10 to 80 years.] 

Fig. 1. Relationship of strength to age. Values are plotted as per cent of the 
maximum. Each curve is drawn to a different baseline, separated by 20% from the 
next. Sources: curves 1-6, 13; 7-9, 14; 10-15, 18; 16, 1. 


Discussion 


Statistical Considerations. Curves relating strength and age derived 
from other data (13, 14, 18, 1) have been plotted with two gradients found 
in this study (Figure 1). The close parallel among these curves is to be 
noted, especially in view of the fact that they represent measurements 
made over a period of more than 100 years by different persons with many 
different pieces of apparatus. It is clear that the development of mus- 
cular strength follows a systematic trend with an increase in strength up 
to the late twenties and a decline, usually at an increasing rate, from that 
time on. These curves are not to be taken as indicating the exact re- 
lationship between age and strength, for we have no assurance of the 
comparability of the samples at the various ages. Especially is this true 
with respect to the comparability of the upper and lower age levels. 
Quetelet (13) presented his strength data only as averages for certain 
age ranges and he remarked that some of these averages were based on 
as few as ten cases. He gives no description of his subjects but we can 
surmise from other parts of his treatise that most of them were middle or 
upper class individuals. Galton’s data, which have been analyzed by 
Ruger and Stoessiger (14), were collected at a health exposition. Al- 
though his numbers were large, his subjects probably had some bias, 
especially the older groups, since the persons measured were interested 
in attending a health exposition, were able to walk around examining 
exhibits, and were willing to submit to a battery of 15 measurements. 

The same sort of limitation, failure to measure as many of the least 
physically fit of the upper as of the lower age ranges, probably applies 
to the data of Ufland (18) who measured several thousand wage-earners in 
a variety of occupations. 

Similar caution should be used in interpreting the data of this report. 
The decrease in score with age in Table 3 would probably be greater if 
the samples at the upper age levels were not selected with respect to lon- 
gevity and the ability to hold a regular job. Absenteeism, which may have 
been affected by the individual work-load in the TNT plants in which 
these data were gathered, may also have influenced the data for the upper 
age levels. Although it has been found in one light industry that the ab- 
sentee rate was no higher among the older than among the younger 
workers (15), in the TNT plants studied the work-load may have been 
such that the older men took sick leave or days off more frequently. 
If this were true, then another bias may exist in the data favoring healthy, 
physically fit, and well-motivated old men who were working on the days 
when the tests were administered. 

If it be assumed that a bias does exist in the data for older groups and 
that the tendency is for more of those with poorer scores to be missing 
from the group than of those with better scores, one deduction can be 
made which is capable of statistical evaluation. It is simply that if sub- 
jects in their prime make up a “normal” distribution, then those in the 
older groups should show a more restricted distribution with a positive 
skew. This should occur, hypothetically, because the weaker members of 
the older group are less likely to be measured in a testing program than are 
such members of the younger group, although most of the stronger older 
persons are likely to be measured. 

Data to test this hypothesis (Table 4) were taken from Ruger and 
Stoessiger’s tabulation of Galton’s data for the grip of the stronger hand 
(14, Table 6), shown in curve 7 of Figure 1. The moments of the distribution 
were calculated for one group, including ages 29, 30 and 31, which 
are at the maximum of those authors’ spline-drawn curve of the regression 


Table 4 

Comparison of the Distribution of Strength Scores (Grip of Stronger Hand) at the 
Maximum and in the Seventh Decade. Data from Galton (14) 

Age Group         29-31               61-69 
N                 530                 146 
Mean              84.10 lbs.          74.25 lbs. 
Mode*             83.85 lbs.          71.75 lbs. 
σ                 10.85               9.44** 
χ (skewness)      0.024               0.265** 
σ_χ               0.053               0.101 
t_χ               0.45 (P > 60%)      2.61 (P = 1%) 

* Where Mode = Mean - χσ (12). 
** Corrected for difference in means of yearly age samples. 


of strength on age. The measure of skewness, chi (12), was 0.024 (Table 
4) which indicates no departure from a “normal” or Gaussian curve. 
Another sample of men in the seventh decade from the same table was 
similarly treated. There were at least 11 men at each age and the total 
N was 146. The chi of the older group’s distribution, +0.265, was 2.6 
times its standard error (Table 4) and indicates a significant degree of 
skewness; a value as large would be expected to occur by chance, in the 
positive direction, only once in 200 samples. The difference between the 
chi’s of the two distributions was significant at the 5% level of confidence 
and that between the two sigmas also was significant at the 5% level. 
To the extent indicated by these two P-values, the original deduction is 
verified. The difference between the calculated modes is 2.25 pounds 
(dynamometer pull) more than that between the means and this value 
becomes the minimum estimate of the amount of bias in the difference 
between the means which has resulted from selection at the older ages. 
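
The arithmetic behind these figures can be retraced in a few lines. The sketch below is only an illustration: it assumes that chi is Pearson's skewness, (Mean - Mode)/sigma, which reproduces the tabled modes, and the function names are hypothetical rather than taken from the original report.

```python
# Retrace the Table 4 arithmetic under the assumption that
# chi = (mean - mode) / sigma (Pearson's skewness).

def mode_from_skewness(mean, sigma, chi):
    """Mode implied by Pearson's skewness chi = (mean - mode) / sigma."""
    return mean - chi * sigma

def ratio_to_standard_error(chi, se_chi):
    """How many standard errors the skewness lies from zero."""
    return chi / se_chi

# Values as printed in Table 4 (grip of stronger hand, pounds).
prime = {"mean": 84.10, "sigma": 10.85, "chi": 0.024, "se_chi": 0.053}
older = {"mean": 74.25, "sigma": 9.44, "chi": 0.265, "se_chi": 0.101}

mode_prime = mode_from_skewness(prime["mean"], prime["sigma"], prime["chi"])
mode_older = mode_from_skewness(older["mean"], older["sigma"], older["chi"])

t_prime = ratio_to_standard_error(prime["chi"], prime["se_chi"])
t_older = ratio_to_standard_error(older["chi"], older["se_chi"])

mean_gap = prime["mean"] - older["mean"]   # decline measured on the means
mode_gap = mode_prime - mode_older         # decline measured on the modes
selection_bias = mode_gap - mean_gap       # minimum estimate of the bias

print(f"modes: {mode_prime:.2f} and {mode_older:.2f} lbs.")
print(f"t ratios: {t_prime:.2f} and {t_older:.2f}")
print(f"mode gap exceeds mean gap by {selection_bias:.2f} lbs.")
```

Run on the unrounded modes, the excess comes out a shade under the 2.25 pounds quoted in the text, which was obtained from the modes rounded to two decimals.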

Relation of Strength to Other Aspects of Ageing. An increase in strength 
during childhood and adolescence may be assumed to be a result of mat- 
uration of the organism. During this period environmental influences 
are slight in comparison with changes due to growth. The decline of 
strength in the latter part of life is also to be explained in large part on an 
organic basis (2, 17), but environmental factors may also be significant. 
In many occupations increasing age may be accompanied by advancement 
to administrative or supervisory, and usually more sedentary, positions, 
with resulting loss of strength through disuse. Avocational participations 
also show a trend, during the adult years, in the direction of less 
active recreations and hobbies (6). 

Decrease in exercise may be responsible for some of the fall in the 
curves of Quetelet and Galton (Figure 1), but changes in neither occupa- 
tional nor avocational exercise can account satisfactorily for the decline 
in strength scores found in Ufland’s study or in the industrial group of 
Table 3. The subjects of these studies were all actively engaged in em- 
ployments which were approximately the same regardless of age, and 
therefore occupational exercise may be considered constant. In view of 
the fact that almost all these occupations involved considerable physical 
activity, avocational exercise could be the source of little additional 
variability. 

Unfortunately, there are no data which indicate the importance of 
motivation or even the usual direction of its effect on strength scores, in 
the upper age groups. It is not unlikely that motivation can influence 
scores in both directions depending on the social setting of the testing 
and the proneness of the individual to display physical prowess. 

The findings on increase and decline of strength scores are in general 
agreement with those reported for intellectual abilities (16, 7, 5, 10) and 
motor skills (8, 9). The wide variability in any age group and the over- 
lapping between age groups which are shown in Table 3 and which can be 
inferred from the size of the coefficients of Table 2, are also in harmony 
with the findings on these other abilities. The importance of the nervous 
system to all these sorts of behavior lends support to the supposition 
that their declining efficiencies in old age have similar causes. 


Summary 


1. Measurements of hand strength on a group of 552 male manual 
industrial workers showed maximum strength in the middle twenties 
with a continuous decline thereafter. At age 60 the decline in average 
strength amounted to 9.25 kg., or 16.5% from the maximum. There was 
considerable overlapping among age groups. 

2. These findings are in agreement with other data on several measures 
of muscular strength which show that strength increases up to the middle 
or late twenties and declines continuously thereafter. In most studies 
the rate of decline increases with age. 

3. Sources of error in “cross-sectional” studies and the relation of 
strength to some other aspects of ageing are discussed. 


Received October 18, 1946. 


A Five Month Strength Curve 


James D. Weinland 
New York University 


Do people have long term swings in variability, that are controlled 
mainly by some internal mechanism? In order to test this hypothesis, 
called Hersey Variability, in some slight degree, the writer measured 
his strength every day for a five month period. Some interesting data 
were uncovered also on a long term improvement in strength in response 
to the relatively small amount of exercise involved in taking the test. 

Rexford B. Hersey¹ published his study of workmen’s emotions, in 
book form, in 1932. He reported recurrent emotional fluctuations in 
male workers, that went from high to high, or low to low, in from three 
to nine weeks. Ease or difficulty of work accompanied these fluctuations. 
Environmental causes apparently hurried or delayed the up and down 
swings slightly but the fluctuations themselves seemed to be internally 
controlled. 

Various conclusions have been drawn from this study. Workmen’s 
vacations might be given them in the low periods; individuals with similar 
cyclical periods could be teamed; work might be planned so that the 
heavier or more difficult jobs were arranged in periods of up-swing. De- 
cisions, it is said, should be avoided during the low periods, and everyone 
is reminded that no matter how miserable he may feel, he can have the 
assurance that he will feel better before long. It seemed that these conclusions 
should be further tested before recommendations became too 
widely scattered through the literature, and so the present study was 
undertaken.² 


Procedure 


Measurements were made by squeezing the Smedley hand-dynamometer 
as hard as possible with the left hand, then the right, then the left 


1 Hersey, R. B. Workers’ emotions in shop and home; a study of individual workers 
from the psychological and physiological standpoint. Philadelphia: University of Penn- 
sylvania Press, 1932. 

2 Reference is here made to: Hersey, R. B. Cycles in workers’ efforts and emotions. 
Engineers and Engineering, 1929, 162-166; Hersey, R. B. A monotonous job in an 
emotional crisis. Person. J., 1930, 9, 290-296; and Brentlinger, W. H. The emotional 
stability of the transient. J. appl. Psychol., 1936, 20, 193-207. 
and finally again with the right hand. Four measures were taken, 
rather than one, to increase the reliability. The average of these four 
measures was considered the day’s strength. The tests were taken at the 
same time each day, immediately on arising in the morning and before 
breakfast. 

In order to determine the approximate variability due to the instru- 
ment a control curve was constructed. The control measurements were 
made in precisely the same manner as those described above, beginning 
at 3 P.M. on one day only and repeated every ten minutes for ten trials, 
till fatigue began to be an influence. The averaged measures were: 63- 
61-61-62-62-63-62-62-62-61 kilograms. The standard deviation of 
the control curve was .63. The standard deviation of the daily strength 
curve was 2.9. It seems probable that most of the variability in the five 
months’ strength curve, some two standard deviations, was due to factors 
other than those inherent in the instrument used. 
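
The comparison of instrument variability with day-to-day variability can be sketched as follows. This is a minimal illustration using only the ten rounded control averages printed above; the variable names are hypothetical. Because the printed averages are rounded to whole kilograms, the figure recomputed from them need not agree exactly with the .63 reported, which was presumably worked from the unrounded readings (an assumption, since the original computation is not shown).

```python
from statistics import mean, pstdev

# Ten control averages (kilograms) as printed, taken ten minutes apart.
control_kg = [63, 61, 61, 62, 62, 63, 62, 62, 62, 61]

control_mean = mean(control_kg)
control_sd = pstdev(control_kg)   # population standard deviation

daily_sd = 2.9                    # standard deviation of the daily curve, as reported

# How many times larger the daily variability is than the control variability.
ratio = daily_sd / control_sd

print(f"control mean = {control_mean:.1f} kg, control S.D. = {control_sd:.2f}")
print(f"daily S.D. is about {ratio:.1f} times the control S.D.")
```

Whichever control figure is used, the daily standard deviation is several times the control value, which is the point of the comparison.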


Results 


The strength measurements in kilograms, with the days on which 
they were taken, numbered consecutively from the first day, are given 
in Table 1 below, and in Figure 1. 


Discussion 


In graphing the results there appeared to be a gradual increase in 
strength continuing throughout the whole term. To determine this more 
precisely the line was smoothed with a line of least squares, and the 
general increase was found to run with an increment determined by the 
two points; 56.39 kilograms on the first day to 60.08 on the last day. The 
author assumes this increase to be due primarily to increasing strength. 
He had had considerable previous practice with the Smedley dynamometer, 
having used it previously in other experiments and for demonstrations. 
This continuing increase in strength seems remarkable when considered 
in relation to the amount of exercise. Two intense squeezes with 
each hand each day, led to a strength increase still continuing at the 
end of five months. If such increases can be generally obtained, as 
cheaply, by men over forty-five, it might be an important fact in personal 
hygiene. 
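
The trend line described above can be illustrated with a short sketch. The helper `least_squares_line` is a hypothetical name for an ordinary least-squares fit of strength on day number; it is checked here on a toy series, not on the study data. The endpoint arithmetic uses only the two fitted values reported in the text and assumes the last day is day 152, as numbered in Table 1.

```python
def least_squares_line(days, values):
    """Ordinary least-squares fit of values on days; returns (slope, intercept)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Sanity check on an exactly linear toy series (not data from the study).
toy_slope, toy_intercept = least_squares_line([1, 2, 3, 4], [2, 4, 6, 8])

# Endpoints of the fitted line as reported in the text.
first_day, last_day = 1, 152          # last day assumed from Table 1
start_kg, end_kg = 56.39, 60.08

gain_kg = end_kg - start_kg
slope_per_day = gain_kg / (last_day - first_day)
gain_pct = 100 * gain_kg / start_kg

print(f"gain = {gain_kg:.2f} kg ({gain_pct:.1f}% of the starting level)")
print(f"implied slope = {slope_per_day:.4f} kg per day")
```

On these figures the fitted gain is a little under four kilograms over the period, or roughly one fortieth of a kilogram per day.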

Examination of the strength curve around the line of least squares in- 
dicates a long range variability that, in general, resembles Hersey Varia- 
bility. By general inspection we find major low areas as follows: On the 
first day and again on the 48th to the 64th day with a 48 day interval, 
and again after another 48 day interval on the 112th day. High points 
occurred from the 25th to the 34th day; again after a 46 day interval, 

Table 1 

Average Dynamometer Strength in Kilograms Each Day for Five Months 
Note: Mean = 58.23; S.D. = 2.9 

January     February    March       April       May         June 

            (19) 58     (47) 53     (78) 59     (108) 59    (139) 61 
            (20) 56     (48) 52     (79) 57     (109) 58    (140) 65 
            (21) 60     (49) 58     (80) 60     (110) 60    (141) 65 
            (22) 59     (50) 58     (81) 59     (111) 57    (142) 64 
            (23) 56     (51) 54     (82) 58     (112) 55    (143) 61 
            (24) 58     (52) 54     (83) 60     (113) 61    (144) 63 
            (25) 61     (53) 56     (84) 56     (114) 57    (145) 63 
            (26) 58     (54) 53     (85) 56     (115) 60    (146) 64 
            (27) 57     (55) 54     (86) 60     (116) 61    (147) 64 
            (28) 57     (56) 57     (87) 57     (117) 61    (148) 62 
            (29) 56     (57) 54     (88) 60     (118) 61    (149) 62 
            (30) 58     (58) 54     (89) 57     (119) 56    (150) 62 
            (31) 58     (59) 56     (90) 58     (120) 57    (151) 61 
(1) 54      (32) 57     (60) 58     (91) 60     (121) 60    (152) 61 
(2) 54      (33) 58     (61) 60     (92) 59     (122) 59 
(3) 54      (34) 61     (62) 58     (93) 58     (123) 59 
(4) 53      (35) 60     (63) 56     (94) 60     (124) 60 
(5) 55      (36) 58     (64) 51     (95) 60     (125) 66 
(6) 54      (37) 57     (65) 56     (96) 58     (126) 57 
(7) 57      (38) 57     (66) 56     (97) 60     (127) 59 
(8) 54      (39) 58     (67) 56     (98) 58     (128) 59 
(9) 55      (40) 57     (68) 57     (99) 58     (129) 59 
(10) 57     (41) 56     (69) 57     (100) 59    (130) 57 
(11) 55     (42) 59     (70) 60     (101) 58    (131) 58 
(12) 54     (43) 56     (71) 58     (102) 60    (132) 61 
(13) 56     (44) 58     (72) 57     (103) 58    (133) 60 
(14) 58     (45) 54     (73) 56     (104) 59    (134) 60 
(15) 56     (46) 57     (74) 59     (105) 57    (135) 61 
(16) 59                 (75) 58     (106) 58    (136) 59 
(17) 59                 (76) 57     (107) 58    (137) 61 
(18) 58                 (77) 60                 (138) 61 

from the 80th to the 97th day, and once more after 43 days on the 140th 
day. These intervals are taken from the graph by inspection. The 
major swings in strength are quite obvious, though slight differences in 
intervals might be suggested by other techniques of observation. 

The author had anticipated that he would secure evidence against 
Hersey Variability, believing that environmental factors would be more 
influential. The results did not support his anticipations. 

The curve also indicates a small, day to day, variability in strength. 
The amount of work done, food eaten, sleep, and other incidental re- 
sponses are probably responsible for these changes. 

Fig. 1. Five months strength curve. 


Summary 


1. Hersey variability is supported by this study, limited by the fact 
that all the data refer to one individual. 

2. A slow increase in hand strength continued over the whole five 
month period as a consequence of the daily test exercise. 

3. A small, day to day, variability in hand strength was indicated. 


Received September 24, 1946. 





The MacQuarrie Test for Mechanical Ability: 
III. Follow-up Study 


Charles H. Goodman 
Radio Corporation of America 


This article presents the findings of a follow-up study made five 
months after the last of the 329 female subjects had been tested and hired.¹ 
This follow-up study was conducted for the purpose of determining what 
had happened to the group of 329 female operators in the factory. 


Data 


The data used in this follow-up study were obtained from the per- 
sonnel records of the 329 subjects. Of the information thus obtained, 
most valuable were the data secured from the termination records.² 


These records contain the following information: 

1. Number of months on the job; 
2. Number of times individual was hired; 
3. Reason for termination; 
4. Foreman’s rating on the following factors: 
   a) Quantity: extent to which employee keeps up with production schedule; 
   b) Quality: based on overall percentage of rejects; 
   c) Initiative: ability of employee to carry on without extra help or 
      instruction; 
   d) Personality: ability to get along with co-workers and supervisors; 
   e) Punctuality: on time at starting whistle, after lunch, and after rest 
      periods; 
   f) Attendance: percentage of absence during total time spent on job; 
   g) Work attitude: employee’s desire to do a good job; how well worker 
      likes job, as reflected in quality produced; how well she takes 
      instruction and supervision; 
   h) Adaptability: how quickly worker adjusts to change in process, changes 
      in position in line, versatility, flexibility; 
   i) Conduct: general behavior at work; amount of rowdiness, trouble 
      making, gossiping, etc.; 
5. Whether employee is “S” satisfactory or “U” unsatisfactory, based on 
   foreman’s indications as to whether employee made a determined effort to 
   do a good job, and whether he would rehire employee in his section; 
6. Whether foreman recommends employee be placed elsewhere if employee 
   wishes to return to company; 
7. Foreman’s indication of type of job for which employee is best suited; 
8. Rater’s signature; and 
9. Pertinent remarks. 


There are, of course, the usual criticisms that are generally levelled at 
rating scales, which can be readily applied to the rating scale used on the 
termination report. Some of the foremen admitted that they gave little 


1 Goodman, Charles H. The MacQuarrie Test for mechanical ability, I. Selection 
of radio assembly operators. J. appl. Psychol., 1946, 30, 586-595. 

2 A termination record is completed for each employee leaving the company. The 
personnel officer sends this report to the employee’s foreman who must complete this 
form before the employee is released. 
thought to their rating scale reports and stated they checked off the 
columns quickly in order to get rid of the form. The Personnel Office, 
in order to overcome this indifference on the part of the foremen, added 
items 5, 6, 7, 8 and 9 to the Termination Record with the thought that 
some check would be had in terms of the consistency of the ratings with 
these items.³ 


Terminations 


Upon completion of the study of the personnel jackets it was found 
that 193 of the 329 subjects had terminated their employment with the 
company. This figure represents a turnover of 58% during the ten 
month period from November 1943 to August 1944. What were the 
reasons for these terminations? The reasons given by the employee to 
the personnel officer at the time of the exit interview have been grouped 
into major headings and are presented in Table 1. 


Table 1 
Company and Employee Reasons for Terminating Jobs 

                                Per Cent 
                                of Total 
Employee Reasons                Terminations    Company Reasons 

Family Problems                 31.6            For Cause 
Personal Problems               15.5            Lack of Work 
Leaving City                    15.0 
Leaving City to Join Husband    10.9 
Other Work                       3.1 
Enter Armed Services             3.1 
Returning to School              2.5 
Lack of Transportation            .5 

158                             82.0            35              18.0 





Table 1 has been divided into two parts: the first part shows the reasons 
given by the employees who sought their own termination; the 
second part shows the reasons given by the company for discharging this 
group of employees. Of the 193 cases who terminated, 158 or 82% of the 
total terminations were sought by the employees themselves. Thirty-five 
employees or 18% of the total terminations were effected by the 
company. Of the company terminations 29, or 15% of the total terminations, 
were for “cause,” while six employees or 3% of the total terminations 
were due to the cessation of a special project. The groupings in Table 1 
are given in broad 


3 In view of the limitations of the rating data and the manner in which some of them 
were completed, extreme caution has been exercised in interpreting this material. 







categories. Finer breakdowns have been made of these broad categories, 
and are shown in Tables 2 and 3. 

Table 2 
Breakdown of Family Reasons for Terminating Jobs 

                              Per Cent 
                              of Total 
Family Reasons                Terminations 

Child Care                    17.1 
Family Illness                 3.6 
Family Reason                  3.1 
Family Responsibilities        3.1 
Pregnancy                      2.6 
Family Duties                  1.0 
Husband Objects                1.0 


Table 3 
Breakdown of Company Reasons for Terminating Jobs 

                              Per Cent 
                              of Total 
Company Reason                Terminations 

Unable to do Work              6.2 
Poor Attendance and Absences   4.7 
Poor Attitudes                 3.1 
Physical Disabilities          1.0 
Lack of Special Work           3.1 


An interesting factor concerning the terminees was disclosed from the 
termination records when Item 5 was examined. Item 5 requires the 
foreman to check the termination record as to whether the employee was 
“S” satisfactory or “U” unsatisfactory, based on his judgment of whether 
the employee made a determined effort to do a good job. It was found 
that of this group of 158 cases who had requested their own release, 28 
were marked unsatisfactory with the recommendation that these individuals 
should not be rehired if they sought to return to the company. 
In all probability these individuals would have been terminated by the 
company under normal conditions. The shortage of labor during the 
war was probably the factor that saved them from being discharged by the 
company. This finding places a different light on the interpretation of 
Table 1. Actually, instead of the 15% indicated as being terminated by 
the company for “cause,” there were 28 additional subjects who were 
unsatisfactory workers. The total per cent of all unsatisfactory employees 
was in reality 29.5% of all those who terminated. 
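
The percentages quoted in this section follow directly from the counts given in the text. The short sketch below simply recomputes them; the variable and function names are hypothetical.

```python
# Counts as reported in the text.
terminated = 193             # all terminations among the 329 subjects
by_employee = 158            # terminations sought by the employees
by_company = 35              # terminations effected by the company
for_cause = 29               # company terminations for "cause"
special_project = 6          # company terminations when a special project ended
rated_unsatisfactory = 28    # self-terminated but marked "U" on Item 5

def pct(part, whole):
    """Part expressed as a per cent of the whole."""
    return 100 * part / whole

employee_share = pct(by_employee, terminated)
company_share = pct(by_company, terminated)
cause_share = pct(for_cause, terminated)
project_share = pct(special_project, terminated)

# Discharged for cause plus those rated unsatisfactory who left on their own.
unsatisfactory_share = pct(for_cause + rated_unsatisfactory, terminated)

print(f"sought by employees: {employee_share:.1f}%")
print(f"effected by company: {company_share:.1f}%")
print(f"for cause: {cause_share:.1f}%, special project: {project_share:.1f}%")
print(f"all unsatisfactory workers: {unsatisfactory_share:.1f}%")
```

Adding the 28 workers rated unsatisfactory to the 29 discharged for cause gives 57 of the 193 terminees, which is the 29.5% stated above.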


Tenure 


Having examined the reasons given by the company and the employees 
for these terminations, it was decided to examine the 
company tenure of the individuals who had terminated.⁴ A tabulation 
of the number of days worked by the terminees was made and disclosed 
the following facts. Fifty-two subjects or 26.9% of all those who left 
the company, worked only 30 days. Twenty per cent of the employees 
terminated after 60 days, and this downward trend in numbers is maintained 
as the time period increases. At the time the tabulation was made, 
it was found that only three employees or 1.6% of the terminees had 
worked for as long a period as 210 days. The average length of working 
time by the 193 terminees was 122 days; however, this figure is misleading 
because of the positive skew of the distribution. It was found that 78.1% 
of the terminations occurred before the mean time of 122 days had elapsed, 
while 21.9% of the terminees remained beyond the mean time. The 
sigma of the distribution is 49.65 days. The curve itself is definitely of 
an inverse J type, and shows the rapid decline in numbers as the time 
interval increases. 


Selection Efficiency 


In the first article of this series, correlations were made to determine 
the efficiency of the MacQuarrie Test in predicting Vestibule Training 
Ratings. The correlations obtained were found to be low in size and 
offered little possibility for prediction. The question was raised in terms 
of the possibility of selecting future employees by establishing a critical 
score based on the actual working experience of these subjects. Table 
4 shows the mean MacQuarrie Test scores for each of the groups used in 
an attempt to establish a critical score. 


Table 4 
Mean and Sigma MacQuarrie Test Scores of Groups Used Comparatively 

Group                       N       Mean    σ 

All Subjects                329     46.8    12.0 
Still Employed              136     43.8    11.7 
Unsatisfactory Workers       28     48.4    11.9 
All Terminees               193     47.7    12.3 
Discharged for Cause         29     40.3    11.8 





4 It will be recalled that the subjects of this study were hired over a five month 
period, and that this study was begun five months after the last person had been hired. 
Hence, it was possible that an individual hired at the beginning of the period could 
have had a maximum employment of ten months, whereas an employee hired at the 
end of the period could only have worked a maximum of five months. 













Critical ratios taken among these groups in order to determine the 
significance of the differences between the mean test scores revealed the 
following. The critical ratio between the group of 193 terminating and 
the group of 136 who remained on the job was 2.99, favoring the group 
that terminated. Here the chances are 99.9 in 100 that the difference 
is significant. The people who left the company made better test scores 
than did those who remained on the job. 

The group discharged by the company, when compared with those 
who remained on the job, showed a critical ratio of 1.41 in favor of those 
who remained on the job. The chances are 92 in 100 that those who re- 
mained on the job tended to make better test scores than those who were 
discharged. 

The critical ratio of the difference between the group that terminated 
and the group discharged by the company was found to be 3.01. This 
ratio indicates virtual certainty that those who terminated 
made better test scores. 

Finally, the comparison between the group discharged by the company 
with the group described by the company as being “unsatisfactory” 
showed a critical ratio of 2.53 favoring those who were considered 
unsatisfactory. The chances are 99.4 in 100 that the difference is significant. 
Superficially, it would appear that by taking the MacQuarrie 
scores ranging from 43.78 to 47.73, a critical score could be established. 
This would on the surface appear to eliminate those who were discharged 
by the company, the unsatisfactory group, and the group that terminated. 
However, a frequency table of the scores of the 136 subjects who remained 
on the job showed that only 14 persons had scores between the range of 
43.78 and 47.73. Sixty scored below 43.78 and the remaining 62 made 
scores above 47.73. The possibility of establishing a critical score on the 
basis of this evidence does not appear to be encouraging. 
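
The critical ratios in this section can be approximated from the means, sigmas and group sizes of Table 4. The sketch below assumes the usual formula for independent groups, the difference between means divided by the square root of the summed squared standard errors; the article does not state which formula was used, and the function name is hypothetical. Because the tabled means and sigmas are rounded, the results land near, not exactly on, the reported values.

```python
import math

def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error
    (independent groups assumed)."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff

# (mean, sigma, N) for each group, as printed in Table 4.
still_employed = (43.8, 11.7, 136)
all_terminees = (47.7, 12.3, 193)
discharged = (40.3, 11.8, 29)
unsatisfactory = (48.4, 11.9, 28)

cr_terminees_vs_employed = critical_ratio(*all_terminees, *still_employed)
cr_employed_vs_discharged = critical_ratio(*still_employed, *discharged)
cr_terminees_vs_discharged = critical_ratio(*all_terminees, *discharged)
cr_unsatisfactory_vs_discharged = critical_ratio(*unsatisfactory, *discharged)

for label, value in [
    ("terminees vs. still employed (reported 2.99)", cr_terminees_vs_employed),
    ("still employed vs. discharged (reported 1.41)", cr_employed_vs_discharged),
    ("terminees vs. discharged (reported 3.01)", cr_terminees_vs_discharged),
    ("unsatisfactory vs. discharged (reported 2.53)", cr_unsatisfactory_vs_discharged),
]:
    print(f"{label}: {value:.2f}")
```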


Unsatisfactory Workers Versus Company Dischargees 


The finding that the critical ratio of the difference between the mean 
test score of the “unsatisfactory group” and the group discharged by the 
company was 2.53 in favor of the former group, led to a comparison of the 
reasons why the “unsatisfactory group” had so been regarded. It was 
found that of the 28 unsatisfactory workers, 14 of them, or 50.0% of this 
group, were considered unsatisfactory by the foremen for reasons which 
can be broadly considered as due to personality factors. Some of the 
reasons given by the foremen which were classified as personality factors 
are as follows: too excitable; chronic complainer; not dependable; lacking 
in responsibility; indifferent to work; and poor attitude. Four cases or 
14.5% of this “unsatisfactory group” can be considered as due to physical 
factors. Under this heading such statements appeared as: stiff fingers; 
too slow; too old. The reasons given by the foremen for seven cases or 
25.0% of this “unsatisfactory group” can be considered as due to factors 
causing a lack of ability. The remaining three cases or 10.5% of this 
“unsatisfactory group” were so considered by the foremen because of 
poor attendance and tardiness. 

No attempt was made to determine whether the mean MacQuarrie 
test scores of the “unsatisfactory group” were significantly different from 
those of the group “discharged” by the company, since numbers in each of the 
categories were so small. As a matter of interest the differences in terms 
of per cent on each of these categories are shown for these two groups in 
Table 5. 


Table 5 
Comparison of Reasons Given for Unsatisfactory Ratings and Discharges 

                        Company Discharges      Unsatisfactory Ratings 
                               Per Cent                Per Cent 
Reason                  N      of Group         N      of Group 

Unable to do Work       12     41.4              7     25.0 
Poor Personality         9     31.0             14     50.0 
Poor Attendance          6     20.7              3     10.5 
Physical Disability      2      6.9              4     14.5 





It appears from Table 5 that of the group “discharged” by the company 
the largest percentage was due to a lack of ability on the part of 
individuals to do the work. On the other hand the largest percentage 
of the group considered “unsatisfactory” workers by the company were so 
described because of personality and attitudinal factors. It appears 
that there is a definite need for the use of some personality measures or 
interviewing technique to help screen out these individuals. 


Taylor-Russell Selection Ratio 


In a previous article⁵ mention was made of the plan to use the 
Taylor-Russell selection ratio for future hiring. The plan was based on the 
following conditions: first, the largest R obtained by combining all of the 
sub-tests of the MacQuarrie with Vestibule Training rating was .46; 
secondly, it was the opinion of the plant superintendent that 50% of the 
employees hired prior to testing were generally satisfactory. It was 
therefore agreed that in future hiring only those individuals who made 
MacQuarrie Test scores that placed them in the top 30% of the distribution 
of those being considered should be hired. The Taylor-Russell 
tables indicate that under these conditions 71% of those selected and 
hired should be satisfactory workers. 

5 Op. cit. 
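
The figure read from the Taylor-Russell tables can be approximated numerically. The sketch below is not the table lookup used in the article; it assumes the bivariate-normal model on which the tables rest, and integrates it by a simple midpoint rule. The function name and the integration settings are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def taylor_russell(validity, base_rate, selection_ratio, steps=4000):
    """Expected proportion of satisfactory workers among those selected,
    assuming test and criterion are bivariate normal with the given
    correlation (validity)."""
    # Cutoffs on the test and on the criterion, in standard-score units.
    test_cut = STANDARD_NORMAL.inv_cdf(1.0 - selection_ratio)
    criterion_cut = STANDARD_NORMAL.inv_cdf(1.0 - base_rate)
    residual_sd = math.sqrt(1.0 - validity ** 2)

    upper = 8.0                      # effectively infinity for a normal score
    width = (upper - test_cut) / steps
    selected_and_satisfactory = 0.0
    for i in range(steps):
        x = test_cut + (i + 0.5) * width          # midpoint of the slice
        # Probability of exceeding the criterion cutoff at test score x.
        p_success = 1.0 - STANDARD_NORMAL.cdf(
            (criterion_cut - validity * x) / residual_sd
        )
        selected_and_satisfactory += STANDARD_NORMAL.pdf(x) * p_success * width
    return selected_and_satisfactory / selection_ratio

# Conditions stated in the text: R = .46, 50% satisfactory, top 30% hired.
expected = taylor_russell(validity=0.46, base_rate=0.50, selection_ratio=0.30)
print(f"expected proportion satisfactory: {expected:.2f}")
```

Under these assumptions the result falls in the neighborhood of the 71% quoted from the tables; a small difference is to be expected, since the published tables are entered at fixed steps of the validity coefficient.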

The acute manpower needs which arose as a result of the war pre- 
vented this plan from being placed into operation. It was decided, how- 
ever, to apply these conditions to the original group of subjects. While 
this is in retrospect, it was thought it might prove of interest. Had the 
top 30% of the 329 subjects been hired, only 98 persons would have been 
put to work. All of these 98 persons would have had a MacQuarrie 
Test score of 55 or above. What happened to this group of 98 subjects 
who were among the 329 hired? A more meaningful comparison of what 
happened to the group can be had by comparing them with the 231 sub- 
jects who scored below 55 and were also hired by the company. Table 
6 shows the number of subjects of both groups who still remained on the 
job and the number of subjects of both groups who had terminated. The 
critical ratio of the difference of the percentage remaining on the job and 
terminating is 1.36 between these two groups. It favors those who scored 
below 55 on the test. While the difference is not one of virtual certainty, 
the chances are 91 in 100 that more of the group scoring below 55 tended 
to stay with the job. 


Table 6 

Comparison of Terminations and Continued Employment 
Based on Test Score Groupings 

                  Group Scoring 55 and Above                Group Scoring Below 55 
                         Per Cent        Critical                  Per Cent 
                  N      of Group        Ratio              N      of Group 

Still Employed    35      35.7           1.4               101      43.7 
Terminated        63      64.3           1.4               130      56.3 
                  98     100.0                             231     100.0 
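
The critical ratio for a difference between two percentages, and the "chances in 100" attached to it, can be sketched as below. Two assumptions are made that the article does not state: that the standard error of the difference is computed from the two groups separately, and that "chances in 100" is the one-tailed normal probability of the critical ratio. Both reproduce the published figures closely. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def critical_ratio_of_proportions(count_a, n_a, count_b, n_b):
    """Difference between two proportions divided by its standard error
    (separate-group standard errors assumed)."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / se_diff

def chances_in_100(critical_ratio):
    """One-tailed normal probability, expressed as chances in 100."""
    return 100 * NormalDist().cdf(critical_ratio)

# Still employed: 101 of the 231 scoring below 55, 35 of the 98 scoring 55 and above.
cr_still_employed = critical_ratio_of_proportions(101, 231, 35, 98)
chances = chances_in_100(cr_still_employed)

print(f"critical ratio = {cr_still_employed:.2f} (reported 1.36)")
print(f"chances in 100 = {chances:.0f} (reported 91)")
```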





A further comparative breakdown of the data between these two 
groups was made in terms of the number of persons discharged by the 
company. Here the critical ratio of the difference is 3.04 in favor of 
the group scoring above 55; that is, the group scoring above 55 had a 
statistically significant lower rate of company discharge. A second comparison 
was based upon the number of unsatisfactory workers found in 
each group. Here the critical ratio was .14 in favor of the group who 
scored below 55, and has no significance. A third comparison made was of 
the number of subjects who, although they had left the company, were 
considered to be satisfactory workers and would be rehired by the company 
if they sought to return. Here the critical ratio of the difference 
was 2.52 in favor of the group who scored above 55. The finding indicates 
that the chances are 99.4 in 100 that there were more satisfactory 
workers among the group who scored above 55 and terminated. The 
final comparison was made in terms of the total number of satisfactory 
workers for each group; that is, the workers who remained on the job and 
were doing satisfactory work as well as those who terminated but were 
considered satisfactory. The critical ratio was 1.68 indicating that the 
chances are 95 in 100 that there were more satisfactory workers among 
the group of 98 subjects who scored above 55 than there were in the 
group of 231 subjects who scored below 55. 


Table 7 

Comparison of Ratings Among Terminated Subjects 
Based on Test Score Groupings 

                          Group Scoring 55 and Above              Group Scoring Below 55 
                                 Per Cent       Critical                 Per Cent 
                          N      of Group       Ratio             N      of Group 

Satisfactory              51     52.0           2.5               86     37.2 
Unsatisfactory             9      9.2            .14              18      7.8 
Discharged                 3      3.1           3.1               26     11.3 
                          63     64.3                            130     56.3 
Total Satisfactory, 
Working and Terminated    86     87.7           1.7              187     80.8 





According to the expectations of the Taylor-Russell ratio, 71% or 
72 subjects of the group of 98 who made MacQuarrie scores of 55 and 
above should have been satisfactory workers. The results of this 
study show that 87% of this group were considered satisfactory workers. 
On the other hand, it was also found that 80% of the subjects who scored 
below 55 on the MacQuarrie were found to have been satisfactory workers. 
The group scoring above 55 have a superior record in terms of fewer 
company discharges. On the other hand, there is a greater tendency 
for the group scoring below 55 to remain on the job. To the industrialist, 
tenure or longevity is an important factor in the successful operation of 
his plant. 

These comparisons appear to support the earlier evidence of this paper 
which showed that the MacQuarrie Test was not a good discriminatory 
device in selecting workers for the radio assembly work being done in this 
plant. 


Summary and Conclusions 


Five months after the last of the 329 subjects had been tested and 
hired, a follow-up study was made to determine what had happened to 
this group in the factory. It was found that 193 or 58% of the original 
group had terminated their employment. Of this number 35 were discharged 
by the company. The largest single cause for termination among 
those who had done so of their own volition was due to “family reasons.” 
The major cause for company discharge was due to “inability to do the 
work assigned.” 

It was further brought to light that there was among the 158 who 
sought their own release a group of 28 workers whom the company con- 
sidered “unsatisfactory.” The cause for this dissatisfaction was mainly 
due to personality problems. 

A study of the tenure of the 193 terminees revealed that the largest 
percentage of terminations occurred within the first 30 days of employ- 
ment. The mean tenure of the group who terminated was 122 days, but 
this mean is highly affected by the skew of the curve. 

An attempt to establish a critical score on the MacQuarrie Test for 
future selection purposes was unsuccessful and again showed the lack of 
discriminatory powers of the test in this situation. 

A study was made to determine if the percentage of satisfactory 
workers obtained as a result of using the Taylor-Russell ratio was signifi-
cantly different from the percentage of satisfactory workers found among
the workers who would not have been hired had the Taylor-Russell ratio 
been used. The results showed that while there were more satisfactory 
workers among the group who would have been hired under the Taylor- 
Russell ratio conditions, the difference was not one of virtual certainty. 
This again appears to point up the lack of discriminatory power of the 
MacQuarrie Test in this situation. 


Received November 29, 1946. 








A Comparison of Earlier and Later Success in 
Naval Aviation Training* 


Ralph D. Norman 
Princeton University, Princeton, N. J. 


A basic problem in any military training program is the attempt to 
predict as early as possible future success in that program. Aviation 
training particularly, being subject to great expenditure of time and 
monetary expense, merited careful study by psychologists in the recent 
war and both the Army and Navy developed tests to predict, as early as 
possible, future success in aviation activities (1, 2). The Naval Aviation 
Psychology program, for example, to which the writer was attached during 
1943-44, gave two ratings to each prospective naval aviation cadet. 
These were a Flight Aptitude Rating (FAR) and one based on achieve- 
ment in the Aviation Classification Test (ACT). The former was in- 
tended to offer some measure of prognosis of the success of the potential 
pilot in flying per se, whereas the latter was useful in predicting success
in the more academic, or ground school, phase of training. 

Another problem related to this one of initial prediction of success 
on the basis of selected tests evaluated against later criteria is the re- 
lationship of early success in the training program to later achievement. 
This paper is concerned with this question as studied in the naval pilot 
training program at one of the Naval Aviation training units during the 
war. 

The Naval Aviation Training program in 1943 was organized in suc- 
cessive stages, as follows: The Naval Flight Preparatory Schools (NFPS) 
where stress was laid on ground school subjects and physical conditioning; 
the War Training Service (WTS) schools where the cadet learned to fly
simple aircraft such as Cubs or Taylorcraft, and continued some ground 
school work; the Naval Pre-Flight Schools (NPFS) where again ground 
school study and physical conditioning were emphasized; the Naval Air
Stations (NAS) where the major stress was once more on flying, this time 
of primary trainer aircraft. There were even later stages of training 


* This research was conducted at the U. S. Naval Flight Preparatory School, Mon-
mouth College, Monmouth, Illinois in 1944. Opinions expressed in this paper are
personal views of the author, and are not to be regarded as expressions for the Navy 
Department. 
involving advanced military flying at the air bases at Pensacola and
Corpus Christi, as well as carrier flying, but these latter phases are not 
the concern of this study. 

The writer, attached as psychological officer to the Naval Flight Pre- 
paratory School at Monmouth College, Illinois, was impressed with the 
fact that certain individuals who had academic difficulties at that initial 
training school appeared with apparently great frequency on the attrition 
(failure) reports which came from the later training activities such as 
War Training Service Schools, Pre-Flight Schools and Naval Air Stations. 
Hence arose the question, “How is early academic, or ground school, suc- 
cess tied in with results of later training?” 


Method 


It was decided, therefore, to take two hundred cases of cadets who 
had been failed at a later training activity and compare them with two 
hundred other cadets who, according to NFPS files, had not been reported 
as “washouts.” The reason for failure was not considered material except
that the cadets who were attrited for medical reasons were not included. 
The failure group will hereafter be called group X, and the two hundred 
other control cadets will be called group C. 

Group C was selected in the following manner. The cadet who im-
mediately followed the “washout” on the official platoon lists, and who
had been graduated from the NFPS, Monmouth College, was selected
as the control. Thus it was assured that the control cases came, in each
case, from the same platoon as the cadet who had been failed. This
tended to keep the factor of quality of instruction received fairly constant, 
since cadets attended classes by platoons. In case the group X cadet was 
the last in the platoon, the cadet immediately preceding him was chosen 
as a control. In several instances, where two group X cadets followed 
each other, the cadet immediately preceding was used in the first instance 
as a control. As stated above, only cadets who, like the group X cadets,
had been graduated, were used as controls or group C members. 
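The selection rule described above can be sketched as a short routine. This is a minimal illustration, not the procedure as it was actually carried out by hand from the platoon lists; the function name, the data layout, and the cadet names are all hypothetical.

```python
def select_control(platoon, x_index, failed, graduated):
    """Return the control for the failing cadet at position x_index, or None.

    platoon   -- cadet names in official platoon-list order (hypothetical layout)
    failed    -- set of names reported as washouts (group X)
    graduated -- set of names graduated from the flight preparatory school
    """
    # Try the cadet immediately following first, then the one immediately preceding.
    for neighbor in (x_index + 1, x_index - 1):
        if 0 <= neighbor < len(platoon):
            cadet = platoon[neighbor]
            if cadet in graduated and cadet not in failed:
                return cadet
    return None


# Hypothetical platoon: Baker and Clark are adjacent failures, Ford is last on the list.
platoon = ["Adams", "Baker", "Clark", "Davis", "Evans", "Ford"]
failed = {"Baker", "Clark", "Ford"}
graduated = set(platoon)

controls = {name: select_control(platoon, platoon.index(name), failed, graduated)
            for name in platoon if name in failed}
print(controls)
```

In this made-up platoon the first of two adjacent failures receives the preceding cadet as control, and the last cadet on the list likewise falls back to the one before him, as the text describes.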

The next step was to secure the grades for each group from the offi- 
cial grade cards. In a study such as this, where exact comparison was 
necessary, it was decided not to use the re-examination grade if a cadet 
had failed a course. This was due to two reasons: (a) The re-examination 
grade would give one cadet an advantage over another since he received
two final examinations, whereas the control cadet received only one; (b) 
no matter what the degree of success in the re-examination, the cadet’s 
grade for the course, by official directive, was only 2.5, or just passing. 
Hence only actual course grades were used since it was felt that they rep- 
resented the true ability of the cadet. The ground training average 
(GTA), or weighted average of all course grades, was then recalculated
to take into account the failing grade. 

In some instances cadets, for various reasons, were transferred back
a battalion. In these cases, grades in the old battalion up until time of 
transfer were combined with grades in the new battalion after transfer 
to obtain the grade for the course. This was done to prevent a spuriously 
high average from appearing on the card, since it was a well known and 
easily observable fact that cadets’ grades improved materially when they 
were given a chance to repeat the work. 

It would be of interest here to note briefly the naval activities from 
which each of the 200 failing (group X) cadets were “washed out.” Table
1 gives this distribution. 

Table 1
Naval Activities from Which Failing Cadets Came

                          N        %
WTS Schools              98     49.00
Pre-Flight Schools       13      6.50
Naval Air Stations       89     44.50
Total                   200    100.00





Table 2 presents the reasons for failure of these cadets. Several
cases were failures for a combination of reasons, but only the
official reason is given in the table.


Table 2
Reasons for Failure of Failing Cadets

                          N        %
Psychological             1       .50
Flight                  139     69.50
Dropped own request      27     13.50
Academic                 23     11.50
Discipline                9      4.50
Flight and Academic       1       .50
Total                   200    100.00





The battalions of the Monmouth NFPS in 1943, from which the 
Group X members came, were fairly representative, and pretty well 
scattered over the whole year, as Table 3 shows. Thus, each battalion
contributed at least five per cent to the total and gave a good time sample
of cadets for the year. There were thirteen battalions in attendance at 
Monmouth during the year, a new battalion entering each month. 
Table 3
Battalion Distribution of Failing Cadets

                          N        %
Battalion 1—43           13      6.50
Battalion 2—43           22     11.00
Battalion 3—43           14      7.00
Battalion 4—43           14      7.00
Battalion 5—43           18      9.00
Battalion 6—43           15      7.50
Battalion 7—43           10      5.00
Battalion 8—43           15      7.50
Battalion 9—43           12      6.00
Battalion 10—43          16      8.00
Battalion 11—43          16      8.00
Battalion 12—43          20     10.00
Battalion 13—43          15      7.50
Total                   200    100.00

Results 


Failures. There were eight academic subjects in the curriculum. 
These were: Navigation, Recognition of Aircraft, Aircraft Engines, Aerol- 
ogy, Communications, Theory of Flight, Mathematics, and Physics. 
Table 4 indicates the distribution of course failures in both groups. 


Table 4
Distribution of Course Failures Among Failing and Control Cadets

           Nav.  Rec.  Eng.  Aer.  Comm.  Math.  Flt.  Phys.  Total     %
Group X     20     2     1    12     6     25      6    17      89    5.56
Group C      8     2     0     6     1      8      0    10      35    2.19





Thus of 1600 courses (200 cadets times 8 courses each), the number of 
course failures was more than two and one-half times as great among the 
failing or X Group as among the controls. Course for course, except in 
Recognition, the X’s have more failures than the C’s. The difference 
of 3.37% has a critical ratio of 4.81, significant at better than the one per 
cent level. 

While on the subject of course failures, we found it of interest to see 
what the degree of failure was among individuals, that is, how many in- 
dividuals in each group failed one course, two courses, etc. This com- 
parison is given in Table 5. 








Table 5
Comparison of Course Failures by Individuals Between Failing and Control Cadets

           Failed    Failed    Failed    Failed    Failed
             1         2         3         4         5
           Course    Courses   Courses   Courses   Courses   Total      %
Group X      35        11         5         3         1        55     27.50
Group C      19         7         0         0         0        26     13.00





Thus, more individuals failed courses, and more individuals failed 
more than one course among the X group than among the C group. The 
difference is 14.50% and the C.R. is 3.67, also significant at least at the 
1% level. (One individual, as a matter of fact, in the X group had a 
failing final G.T.A. He had failed four courses but graduated when he 
passed the re-examination.) 
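The critical ratios quoted for Tables 4 and 5 are differences between two percentages divided by the standard error of that difference. The paper does not state which standard-error formula was used, so the sketch below assumes the unpooled form; the function name is illustrative. Under that assumption the Table 5 figures give a ratio close to the 3.67 reported, while the Table 4 figures give a ratio near 5, somewhat above the 4.81 printed.

```python
import math


def critical_ratio(count_a: int, n_a: int, count_b: int, n_b: int) -> float:
    """Difference between two proportions divided by its (unpooled) standard error."""
    p_a = count_a / n_a
    p_b = count_b / n_b
    standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return (p_a - p_b) / standard_error


# Table 5: cadets failing at least one course, out of 200 in each group.
cr_individuals = critical_ratio(55, 200, 26, 200)

# Table 4: course failures out of 1600 courses in each group.
cr_courses = critical_ratio(89, 1600, 35, 1600)

print(round(cr_individuals, 2), round(cr_courses, 2))
```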

Advisory Board. The commanding officer’s Advisory Board was 
constituted to call before it all cadets with academic or other difficulties 
to investigate the reasons therefor, and to take action on retention or
dismissal of the cadet from the training program. It was thought ad- 
visable, therefore, to use Advisory Board appearances for academic 
difficulty as another criterion of success of cadets in training. Table 6 
presents the comparison between the two groups. 


Table 6
Number of Cadets Meeting the Advisory Board

Group        No.       %
Group X      30     15.00
Group C       8      4.00





Among the X Group, thirty men, or 15%, had met the C.O.’s Advisory
Board because of academic difficulties whereas among the controls, only
about one-fourth that number, or 4%, had met the Board for the same
reason. Thus almost four times as many X Group members were rec- 
ognized at the Flight Preparatory School with poor academic records. 
The difference is 11% and the C.R. is 4.78, significant again at least at 
the one per cent level. 

Course Grade Comparisons. A grade-by-grade comparison was made 
of each of the 200 X Group members with their controls. Table 7 below 
compares the two groups in terms of the means, differences between the 
means, and the critical ratios for each subject and for the final weighted 
ground training average. 
Table 7
Course Grade Comparisons of Failing and Control Cadets

                   C Group   X Group   Diff. Bet.
Subject             Means     Means      Means      CRs
Navigation          3.19      3.03        .16       4.00
Recognition         3.23      3.13        .10       3.33
Engines             3.37      3.29        .08       2.71
Aerology            3.09      3.02        .07       1.94
Communications      3.84      3.66        .18       3.22
Mathematics         3.26      3.19        .07       1.50
Flight              3.25      3.15        .10       3.16
Physics             3.18      3.12        .06       1.38
Gr. Tr. Av.         3.30      3.19        .11       4.31





In each case the Group C mean is superior to the Group X mean. In 
only three instances may the differences be considered not significant 
statistically at least at the 1% level. These are Aerology, Mathematics 
and Physics. It is of intense practical interest to note that Navigation, 
upon which the Navy placed the greatest emphasis in pilot training and 
to which it devoted the greatest percentage of ground school training 
time, yields the highest difference between means and the highest indi- 
vidual course difference critical ratio. 
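The sorting of courses into significant and not significant can be reproduced from the critical ratios in Table 7. The sketch assumes the 1% level was taken as a two-tailed normal deviate of 2.576; the paper does not give the cutoff it used, and the variable names are illustrative.

```python
# Two-tailed normal deviate for the 1% level (assumed convention, not stated in the paper).
ONE_PERCENT_CUTOFF = 2.576

# Critical ratios for the eight courses, as printed in Table 7.
critical_ratios = {
    "Navigation": 4.00,
    "Recognition": 3.33,
    "Engines": 2.71,
    "Aerology": 1.94,
    "Communications": 3.22,
    "Mathematics": 1.50,
    "Flight": 3.16,
    "Physics": 1.38,
}

not_significant = [subject for subject, cr in critical_ratios.items()
                   if cr < ONE_PERCENT_CUTOFF]
print(not_significant)
```

With that cutoff the same three courses named in the text fall below the 1% level.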

In addition to the above course grade comparison, a grade-by-grade 
contrast was made of each of the 200 X Group members with their C 
Group counterparts to determine whether the C Group member had a 
better course average than the comparable X Group member or vice 
versa. Table 8 below presents the results. In this table, the column 
headed “identical” means the averages were identical.¹

Thus taking all 1600 courses as a unit, the C Group exhibits superior-
ity over the X Group 54% of the time, whereas the X Group is superior
to the C Group only 39% of the time. Sixty-five times out of 100 the C
Group has the better weighted ground training average.


Discussion 
It would have been possible to have delved deeper into this investiga- 
tion. One could have compared those cases which were washed out at 
Naval Air Stations with those who were failed at earlier activities. However,
it was noted that no cadet in any battalion prior to the sixth was
failed at a WTS school, whereas beginning with the sixth battalion such 


1 The large number of identical averages in Communications is due to the grading 
system in that course which allows little spread and in which grades are arbitrary for 
certain speeds of reception. 
Table 8
Relative Superiority of Course Averages in Each Group for Failing and Control Cadets

                    C Group          X Group         Identical
                    N      %         N      %        N      %
Navigation         128   64.00       70   35.00       2    1.00
Recognition        119   59.50       77   38.50       4    2.00
Engines            111   55.50       83   41.50       6    3.00
Aerology           105   52.50       91   45.50       4    2.00
Communications      67   33.50       42   21.00      91   45.50
Mathematics        104   52.00       93   46.50       3    1.50
Flight             119   59.50       76   38.00       5    2.50
Physics            109   54.50       89   44.50       2    1.00
Total              862   53.88      621   38.81     117    7.31
G.T.A.             128   64.00       70   35.00       2    1.00





attrition began. It was believed many cadets who could have been at- 
trited at that stage were held over and washed out at the Naval Air Sta- 
tion. The reason for this probably lay in the early reluctance of WTS 
schools to fail cadets. 

A comparison could also have been made of flight failures with other 
types of failure. The number of cases is limited in this study, however. 
But it is of interest to note that a trend exists, and since the flight failures 
make up about three-fourths of the group studied, it is reasonable to as- 
sume that the trend would have been found among cadets who failed 
purely for flight reasons when compared with controls. 

It is of importance to mention that some of the control cases may con-
ceivably have “washed out” at later stages. Then, from the standpoint
of this study they would have become “experimental” cases and new con-
trols would have had to be selected for them. However, it is
significant to note that up to the point of investigation a trend seemed 
to exist in favor of the control group. Perhaps the dichotomy which some 
individuals believe extant between flight success and ground school suc- 
cess is not as severe as is supposed. This is merely one suggestion offered 
by this study. At any rate, there seems to be corroboration of the fact 
long established by psychologists—namely, that there is a general posi- 
tive correlation among “better” human traits. In any case, it would seem 
that the general policy (with exceptions in specific cases, of course) of 
discouraging retention of the weak student very early in flight training is 
a sound one. 
Summary


1. Two hundred cases of cadets who failed at a naval aviation training 
activity beyond flight preparatory school were compared academically 
with two hundred control cases who had as yet not failed.

2. The attrited or “washed out” cadets had a significantly greater 
percentage of course failures than the controls. 

3. The control group met the Advisory Board at the school for aca- 
demic difficulties with lesser frequency, the difference being statistically 
significant. 

4. The control group had superior course averages in 54% of the 
courses, the failed cadets were superior in 39%, while the groups were
identical in 7% of the courses.

5. Sixty-five times out of a hundred the weighted Ground Training
Average was superior for the controls.

6. In all courses, grades for the controls were superior to those for the
failing group. In Navigation, Recognition, Aircraft Engines, Communi-
cations, and Theory of Flight courses, the superiority of these averages
was statistically significant at least at the 1% level. The differences
were not significant in Aerology, Mathematics, and Physics. The Ground
Training Average, which is the weighted average of all courses, and repre- 
sents final accomplishment in three months of flight preparatory training, 
is significantly superior for the control group. 

Received July 18, 1947. 
Early publication. 
Despite the importance of vision in many types of industrial work
there is only meager experimental evidence on the effect of fundamental 
factors, such as quantity and quality of illumination, in visual perfor- 
mance and visual fatigue. This is primarily due to the difficulties en- 
countered in providing suitable arrangements for quantitative study of 
prolonged visual work under rigorously controlled conditions. 

Although the effect of illumination on output can often be readily 
demonstrated, most types of industrial work are not suitable for ex- 
perimental investigations on visual performance. In addition to vision, 
industrial work involves as a rule other functions, such as manual dex- 
terity. One cannot expect changes in illumination to have a direct effect 
on the level of output unless the visual components of the job are the 
limiting factors in the performance. Finally, the physiological effects 
of experimentally varied illumination can be masked by changes in out- 
put resulting from training and uncontrolled variations in motivation. 
A clear analysis of these facts has not always been made in designing 
experiments and in evaluating the complex test situations in laboratory 
or in industry (6). 

A suitable test of visual performance should meet the following re- 
quirements: (1) Well standardized work task; (2) Possibility of easy 
quantitative evaluation; (3) Clearcut relationship to fundamental visual 
functions; (4) Elimination of auxiliary functions, such as manual skill or 
verbal intelligence, which might influence the results to any significant 
degree; (5) Elimination, as far as possible, of uncontrolled factors affect- 
ing performance, particularly continued training; (6) Practical applica- 
bility of results; (7) Possibility to vary critical factors of performance. 


* This work was supported in part by a research grant from the Verd-A-Ray Cor-
poration, Toledo, Ohio. We wish to express our indebtedness to Mr. O. S. Levi, Re-
search Director of the Verd-A-Ray Corporation, for preparation and calibration of the
bulbs used in this experiment. 
At this laboratory a work test for the study of visual performance and
fatigue has been developed which we believe meets the above require- 
ments. 


Technical Description 


The visual work consists of recognizing single letters (point size 3.5, type
face Gothic No. 25A), which pass through a slit. The visual angle of the
letters is about 10 minutes, which at low levels of illumination (2 to 5 F.C.) is 
only slightly above the threshold of recognition. All letters of the alphabet 
were included but only the capital forms were used. The spacing of the letters 
in the horizontal as well as in the vertical direction was irregular. The letters 
were copied down by hand. 

The letters are printed on strips, 4 inches high and varying in length from 
1% to 2% inches, mounted on a rubber belt in such a way that there is overlap 
and all of the belt is covered. The mounting in small paper strips is done for 
a number of reasons: to avoid wrinkling, to provide for easy replacing of dam-
aged parts, and for changing the sequence of letters in the case of formation
of meaningful letter patterns. With this precaution taken there is no danger of
memorization of the irregularly spaced letters. In our arrangement there 
were 776 letters on the belt and the subject did not know when the sequence 
began to repeat itself; even the assistant who evaluated the scores by using a 
master sheet did not memorize the sequence in 3 months of intensive work. 
The whole belt (25 feet) revolves in 23 minutes, so that the linear speed is 
0.218 inch per second. The time from the moment an 0.049 inch letter begins
to enter the 0.125 inch wide slit to the moment it completely disappears is
0.573 sec. 
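The speed and timing figures above can be checked with a few lines of arithmetic. The sketch assumes that the printed 0.573 sec. was obtained as the slit width divided by the printed belt speed; that reading is an inference from the numbers and is not stated in the text. The constant names are illustrative.

```python
BELT_LENGTH_IN = 25 * 12   # 25 feet of belt, in inches
REVOLUTION_S = 23 * 60     # one revolution in 23 minutes, in seconds
SLIT_WIDTH_IN = 0.125      # width of the slit, in inches
PRINTED_SPEED = 0.218      # belt speed as printed, in inches per second

# Linear speed of the belt from its length and revolution time.
belt_speed = BELT_LENGTH_IN / REVOLUTION_S

# Time for a point on the belt to cross the slit at the printed speed.
slit_transit = SLIT_WIDTH_IN / PRINTED_SPEED

print(f"{belt_speed:.3f} in/s, {slit_transit:.3f} s")
```

On this reading the letter width of 0.049 inch does not enter the printed figure; counting from first entry to complete disappearance would add the time the letter itself takes to clear the slit edge.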

The rubber belt is carried by two wooden drums, especially built for
this purpose, with a slightly convex surface to increase friction. Figure 1 
shows one of the drums. Both drums are mounted on a solid wooden chassis, 
with adjustable tension on the belt. The belt is supported along the length
of its bottom edge by a groove. It is driven by a 1/4 h.p. synchronous motor,
connected with the driving drum by means of several variable reduction gears, 
which provide a rather wide range of speeds. On the outside the belt is pro-
tected by a metal and paper cover. The metal shield, located in front of the
subject, contains a vertical slit 4 by 3 inches. The rubber belt is pressed 
against the sides of the slit by means of rolls which are under spring tension.
The slit is somewhat wider towards the observer in order to prevent shading 
of the narrow slit area. Both the metal shield and the paper on which the 
letters are printed are dull white, so that the contrast between the surrounding 
field and the target area is reduced to a minimum. 

The subjects are seated in individual booths, three on each side of the table. 
The depth of the working place from the border of the table to the screen is 
about 17 inches and the width 41 inches. It can be seen that the working 
place is spacious and the whole arrangement is as comfortable for the subject 
as could be reasonably expected. The table slants slightly downward in order 
that the subject may maintain a convenient posture during the test. The
chairs have back supports and are padded. Figure 1, left side, shows the 
partitions between the booths; on the right side the partitions are removed. 

Figure 2 gives the overall arrangement of the working places, with the 
partitions removed. Each booth is equipped with a light on the left side of 
the subject, a device for letter recording on the right side, and a head rest
keeping the distance from the eyes to the slit constant. The head support 
was mounted at an angle to the screen, as can be seen in Figure 3. The con- 
stant distance between the eyes and the slit in the screen was 21 inches. 
Fig. 1. The drum transports a rubber belt on which are mounted paper strips with
the letters. The left side shows partitions separating the booths. The cellotex sheet 
is removed on the right side in order to make visible a metal screen with the slit. 


Fig. 2. Booths with the partitions removed. The lamps illuminate
the slits from the left. 
Fig. 3. Arrangement of the working place. Metal box, set at an angle of 60° to
the plane of the screen, contains a lamp. There is an elbow support projecting from
the table, rubber padded head rest, and on the right-hand side of the booth a device for
recording letters. 



















Fig. 4. One of the subjects during work. The diaphragm with perforations, used
for the regulation of illumination, is visible in front of the metal box containing a lamp.
The subject looks slightly downward, while recording semi-automatically the letters
passing through the slit. 
Figure 3 shows one working place from above. The longitudinal axis of
the metal box containing lamps forms an angle of 60° with the plane of the 
screen. The lamp socket is mounted on a rod extending through the back side 
of the box; this arrangement is used to bring the tip of the bulbs, differing in 
size, in contact with the diaphragm, in order that the distribution of light in 
the field surrounding the slit may remain constant. The whole box can be 
moved a few inches up and down, and over a distance of 10 inches to and from 
the screen, the axis of the box remaining at the same angle with the screen. 
The head is sufficiently distant from the lamp so that even with 200 watt bulbs 
there are no disturbing effects of heat. An asbestos shield (seen in Figure 3) 
protects the subject from heat and direct light radiation. 

Change of the distance of the lamp from the screen affects both the illumi- 
nation level on the target area and the distribution of light in the field sur- 
rounding the slit. The importance of the surrounding field for recognition of 
details in the central field has been well recognized (3,9). In order to compare 
different lamps it is desirable to change the distance as little as possible. For 
this reason, we modified the illumination intensity in part by inserting in front 
of the lamps interchangeable diaphragms, made of metal plates with multiple 
perforations. By varying the number and size of the openings a satisfactory 
adjustment can be made over a large range of illumination levels. Lamps of 
different wattage are used to extend the range of illumination still further. 
It was possible to obtain a range of illumination from 2 to 300 foot-candles 
(F.C.) on the screen by using 40 watt, 100 watt, and 200 watt lamps with 
different diaphragms. 

A special booth was constructed to determine the three parameters—
diaphragm, wattage, and distance—which would yield the desired illumination 
intensity on the slit. This is an exact duplicate of the booths used, except 
that the screen is replaced by a Western Electric photometer, located in the 
exact position of the slit. By varying the diaphragms and the wattage of the 
lamps, the desired illumination level is obtained with comparatively little 
variation of the distance. Since minor differences in the position of the lamp 
or the mounting of the boxes are unavoidable, the final adjustment of the 
illumination factors is always done at the individual booths. The lamps are 
placed at the distance determined in the special booth; the final adjustment is 
made by a variation of the distance of the box from the screen. Since the 
F.C. obtained on the screen were usually very nearly (often identical with) 
that required, the necessary adjustments of the distance were very small, i.e. 
fractions of 1 inch. 

In regard to the calibration of the illumination level on the screen, it is 
important to consider that the technical allowance for differences in lumen 
output in commercial illuminants is 5%. For experimental work it is desirable 
to use lamps which differ as little as possible in this respect. The lumen 
output of 120 bulbs of the 3 types (Verd-A-Ray, frosted, natural white) and 
3 sizes of the lamps used was measured at the research laboratory of the 
Verd-A-Ray Corporation in Toledo, Ohio. Thus it was possible to select
lamps which differed in their lumen output by less than 1%. While this 
procedure is preferable, a 5% variability of the lumen output would not prevent 
a satisfactory standardization, since a rather small distance variation would 
compensate for such a difference. 

At one of the drums transporting the rubber belt with the letters a metal
pointer is mounted to serve as a guide post in identifying sections of the letter 
sequence. Before he starts to work the subject observes for a few minutes 
letters passing through the slit; he begins to write down the letters immediately 
after the command “go” has been given. The experimenter notes the starting 
point for each subject. 

Figure 4 shows one of the subjects while working, reclining against the back 
support, the left elbow resting on the arm support, and the forehead touching 
the head support. In focusing on the slit the subject is looking slightly down-
ward, which is a more natural position for most jobs than looking straight
ahead or upward. The instrument for recording letters consists of a roll of 
paper, of the kind used in calculating machines, which is led beneath a metal 
plate containing a small open window, and wound up by the subject. In 
several of these instruments the paper was transported by pressing down a
lever, in others by turning a knob. The writing of the letters and the transport 
of the paper was done in a semi-automatic way, without visual control. Neither 
operation requires any appreciable manual skill. Satisfactory transcription
without looking away from the screen was achieved by all subjects after one or
two trials.


For the purposes of evaluating the performance, work samples of 
200 letters are used; the score is expressed as the number of letters cor- 
rectly recognized. For a 2-hour work period, which was the standard 
work time, three work samples of about six minutes each were taken. 
The first sample was taken near the start, the second one at the midpoint, 
and the third one near the end of the 2-hour work spell. The subjects
did not know this arrangement and believed that the whole period was
scored.


Effect of Practice 


The subjects were given a chance to become well acquainted with the 
task, to get adjusted to the work place, and to master the mechanical 
aspects of the work, such as moving the lever of the recording device, 
writing down the letters without taking the eyes off the screen, exchang- 
ing the pencils, etc. After some preliminary experiments, necessary 
for deciding such technical details as the speed of the letter belt, four train- 
ing trials were given. The subjects worked on one day for two half-hour 
periods, separated by a 15 min. rest period, and for two one-hour periods 
on two following days. 

This amount of practice was sufficient to reach a performance plateau 
as indicated in Figure 5. When the six subjects were retested under 
conditions I, II, and III, referring to the three types of illuminants used, 
no consistent increase in the average scores was observed. The average
score for the first three trials was 166.7, for the second three trials 168.1.
Considering the scores obtained in trials repeated with the same illumi- 
nant (1 and 4, 2 and 5, 3 and 6) as paired variates, Y, and Y2, we obtained 
the value of the F test, equivalent to @, 


ae (2d)?/n 
~ Yd? — (Zd)?/n 
n-—-1l 


= 0.554 








where d = (Y; — Y:),n = number of pairs. The value of F is associated 
with 1 and (n — 1) = 17 degrees of freedom, and does not approach the 
5% point of significance,"4.45. 
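
As a rough illustration of the paired-difference F test described above, the following Python sketch computes F from two lists of paired scores. The function name `paired_f` and the sample scores are hypothetical; the individual scores of the first series are not tabulated in this paper, so the sketch does not reproduce the reported value of 0.554.

```python
def paired_f(y1, y2):
    """F for paired variates, with d = y1 - y2.

    F = [(sum d)^2 / n] / [(sum d^2 - (sum d)^2 / n) / (n - 1)]
    It has 1 and (n - 1) degrees of freedom and equals the square of the
    paired t statistic.
    """
    if len(y1) != len(y2) or len(y1) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    d = [a - b for a, b in zip(y1, y2)]
    n = len(d)
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    numerator = sum_d ** 2 / n                          # 1 degree of freedom
    denominator = (sum_d2 - sum_d ** 2 / n) / (n - 1)   # n - 1 degrees of freedom
    return numerator / denominator


# Hypothetical paired scores (NOT the experimental data).
first_trial = [160, 172, 165, 158, 170, 168]
repeat_trial = [162, 170, 169, 157, 173, 171]
print(round(paired_f(first_trial, repeat_trial), 3))
```

The resulting F would then be compared with the tabled value for 1 and (n - 1) degrees of freedom, as is done in the text with 4.45 for 17 degrees of freedom.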

Fig. 5. Performance scores obtained in the first series of six 2-hour work sessions.
The individual scores are based on work samples taken at the start, mid-point, and end
of each 2-hour work spell. The work samples contain 200 letters. Conditions I, II,
III refer to three types of illuminants. Illumination intensity = 5 F.C. Number of
subjects = 6. Heavy lines represent mean scores; light lines represent ±1 S.D.





Consistency of the Test Results 


For the purpose of characterizing the consistency of the test scores on 
repeated testing we shall use data obtained in the third set of six testing 
sessions. As there was no significant difference at this illumination level 
between the three illuminants used (conditions I, II, III) the block of 
scores (Table 1) will be considered as a unit. 


We shall use two criteria of consistency (2):

(1) The random variance, $\sigma_R^2$, expressed as percentage of the total
estimated composite variance, $\sigma^2 = \sigma_I^2 + \sigma_D^2 + \sigma_R^2$.

(2) The coefficient of day-to-day consistency of the scores, $r_C$, not
affected by changes in the trial (daily) means:

$$r_C = \frac{\sigma_I^2}{\sigma_I^2 + \sigma_R^2} = \frac{V_{ID} - V_R}{V_{ID}}$$

The terms $\sigma_I^2$, $\sigma_D^2$, and $\sigma_R^2$ represent components of the total estimated
variance for the population, contributed by the differences between in-
dividual scores, between daily (trial) scores, and by random fluctuation of

Table 1

Performance in Successive Testing Trials Repeated under Comparable Conditions

Note: The scores are averages based on the number of letters correctly recognized
in three 6-minute periods of a 2-hour work spell. The work samples in each period
contained 200 letters. Illumination intensity = 100 F.C.

                              Trial
Subject        1      2      3      4      5      6     $T_I$   $\bar{Y}_I$
H            182    184    182    193    181    183     1105    184.2
R            192    193    195    191    197    195     1163    193.8
O            195    196    198    193    193    196     1171    195.2
P            173    182    186    190    181    188     1100    183.3
B            191    198    195    197    198    198     1177    196.2
V            182    185    181    186    184    180     1098    183.0
$T_D$       1115   1138   1137   1150   1134   1140     T = 6814
$\bar{Y}_D$ 185.8  189.7  189.5  191.7  189.0  190.0     $\bar{Y}$ = 189.3

Y = any score; n = number of subjects; k = number of trials.
$T_I = \sum_{1}^{k} Y_I$ = individual totals; $\bar{Y}_I$ = individual means.
$T_D = \sum_{1}^{n} Y_D$ = daily (trial) totals; $\bar{Y}_D$ = daily means.
T = grand total, $\bar{Y}$ = grand mean.

the scores, respectively. They are defined as follows:

$$\sigma_R^2 = V_R, \quad \sigma_I^2 = V_{ID} - V_R, \quad \sigma_D^2 = V_{DI} - V_R.$$

The scheme for computing the values $V_{ID}$, $V_{DI}$, and $V_R$ is indicated
in Table 2.

Using the V values from Table 2 we obtain:

$$\sigma_R^2 = 11.5, \quad \sigma_I^2 = 50.3 - 11.5 = 38.8, \quad \sigma_D^2 = 13.2 - 11.5 = 1.7.$$

In terms of the estimated total composite variance ($\sigma^2 = \sigma_R^2 + \sigma_I^2
+ \sigma_D^2 = 52.0$), $\sigma_R^2$ represents 22.1% of the total. This would indicate a
moderately high consistency of the repeated measurements.

The coefficient of the day-to-day consistency of the scores, $r_C$, equals:

$$r_C = \frac{38.8}{50.3} = 0.77,$$

providing another index of the moderately high consistency of the test
scores.

Table 2

Scheme for Computing Variances $V_{ID}$, $V_{DI}$, and $V_R$

$\Sigma_1 = \sum_{1}^{nk} Y^2 = 1,291,358$                $N_1 = nk = 36$
$\Sigma_2 = \frac{\sum_{1}^{n} (T_I)^2}{k} = 1,290,961$   $N_2 = n = 6$
$\Sigma_3 = \frac{\sum_{1}^{k} (T_D)^2}{n} = 1,289,849$   $N_3 = k = 6$
$\Sigma_4 = \frac{T^2}{nk} = 1,289,739$                   $N_4 = 1$

In terms of the above defined symbols, $\Sigma$ and N, we obtain:

1) Variance based on the differences between individual values within days,

$$V_{ID} = \frac{\Sigma_1 - \Sigma_3}{N_1 - N_3} = \frac{1509}{30} = 50.3.$$

2) Variance based on the differences between daily measurements within
individuals,

$$V_{DI} = \frac{\Sigma_1 - \Sigma_2}{N_1 - N_2} = \frac{397}{30} = 13.2.$$

3) Variance based on the random fluctuations of the scores,

$$V_R = \frac{\Sigma_1 - \Sigma_2 - \Sigma_3 + \Sigma_4}{N_1 - N_2 - N_3 + N_4} = \frac{287}{25} = 11.5.$$
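
The arithmetic of Table 2 can be retraced from the scores of Table 1. The Python sketch below is offered only as an illustration; the function and variable names are ours, and the formulas are those of the scheme above.

```python
# Scores from Table 1: one row per subject (H, R, O, P, B, V), one column per trial.
scores = [
    [182, 184, 182, 193, 181, 183],
    [192, 193, 195, 191, 197, 195],
    [195, 196, 198, 193, 193, 196],
    [173, 182, 186, 190, 181, 188],
    [191, 198, 195, 197, 198, 198],
    [182, 185, 181, 186, 184, 180],
]


def variance_components(scores):
    """Return the Table 2 variances and the two consistency criteria."""
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of trials
    individual_totals = [sum(row) for row in scores]
    daily_totals = [sum(col) for col in zip(*scores)]
    grand_total = sum(individual_totals)

    sigma1 = sum(y * y for row in scores for y in row)
    sigma2 = sum(t * t for t in individual_totals) / k
    sigma3 = sum(t * t for t in daily_totals) / n
    sigma4 = grand_total ** 2 / (n * k)
    n1, n2, n3, n4 = n * k, n, k, 1

    v_id = (sigma1 - sigma3) / (n1 - n3)   # between individuals, within days
    v_di = (sigma1 - sigma2) / (n1 - n2)   # between days, within individuals
    v_r = (sigma1 - sigma2 - sigma3 + sigma4) / (n1 - n2 - n3 + n4)  # random

    var_r, var_i, var_d = v_r, v_id - v_r, v_di - v_r
    total = var_r + var_i + var_d
    return {
        "V_ID": v_id,
        "V_DI": v_di,
        "V_R": v_r,
        "random_share_percent": 100 * var_r / total,
        "r_C": var_i / (var_i + var_r),
    }


for name, value in variance_components(scores).items():
    print(f"{name}: {value:.2f}")
```

Carried at full precision, the sketch gives values that round to the variances 50.3, 13.2 and 11.5 and to a consistency coefficient of 0.77. The random share comes out at about 22.0% rather than 22.1%, the small difference arising because the text works with the rounded variances.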





Sensitivity to Experimentally Varied Conditions 


To be useful for experimental work, a test must be reasonably reliable 
in the sense of yielding consistent individual values when the testing is 
repeated under standard conditions. In addition, it has to be sensitive 
to the experimental variations in the environmental and physiological 
conditions. 

As an illustration we shall use the effect of working under three levels 
of illumination of 2, 5, and 50 foot-candles, respectively. These refer to 
the illumination levels at the plane of the slit through which the letters 
are passing. 

The results are given in Figure 6. The effect of more adequate illu- 
mination on performance is reflected in three different ways: higher gen- 
eral level of performance, smaller fatigue decrement in output, and smaller 
inter-individual variability. 

At the three levels of illumination (2, 5, and 50 F.C.) the scores, at the
start of the work-spell, averaged for six subjects, were 162.0, 179.2, 193.7,
respectively. Using the F test, we obtain for the differences between 2
and 5 foot-candles an F value of 37.60,** for 2 and 50 foot-candles F
= 72.40,** for 5 and 50 foot-candles F = 13.87;* the F values for 1 and 5
degrees of freedom are 6.61 at the 5% level, 16.26 at the 1% level. This
indicates that within the given range of illumination intensities there is a
significant rise in the level of performance with an increase in the illumi-
nation level. The sequence of the experiments was as follows: 5 F.C.,
50 F.C., and 2 F.C. Therefore, the effects of the level of illumination
cannot be explained as a result of accidental and uncontrolled factors,
particularly changes in motivation and continued practice, which viti-
ated most of the extensive research carried out in the 1920’s under the
supervision of the committee on industrial illumination of the division
of engineering and research of the National Research Council (6).

* One asterisk indicates significance of the given F-test at the 5% level.

** Two asterisks indicate that the given value of F has reached or exceeded the 1%
level of significance.

Fig. 6. Performance score (maximum = 200) at the initial (i), middle (m), and
terminal (t) periods of a 2-hour work spell at three levels of illumination intensity. The
points above and below the circles indicate ±1 S.D. Number of subjects = 6.

When the intensive visual work has been continued for some time, 
without any rest period, the performance deteriorates. The simplest 
index of the magnitude of fatigue is the score decrement from the initial 
to the terminal sample of 200 letters, taken at the start and toward the
end of the 2-hour working period. This decrement averaged 26.0 letters 
at 2 F.C., 14.5 letters at 5 F.C., and 10.2 letters at 50 F.C. The re- 
spective values of the F tests are 4.59 for 2 and 5 foot-candles, 8.13* for 
2 and 50 foot-candles, and 3.05 for 5 and 50 foot-candles; the F value is 
again 6.61 and 16.26, respectively, at the 5% and 1% levels of significance. 
The fatigue decrement is progressively smaller as the illumination level
increases, although only the difference between 2 and 50 foot-candles is 
statistically significant. 

The inter-individual variation increases as the level of illumination 
decreases. In other words, less favorable conditions of illumination 
emphasize the individual differences. The F-ratios were computed for 
all three levels of illumination at the initial, middle and terminal period 
of the work-spell. Table 3 indicates that the inter-individual variances 
at lower levels of illumination are consistently larger for all three work 
periods. This fact is more important than the statistical significance of 
the single F ratios. 


Table 3

Ratios of the Variances of Scores Obtained in the Same Periods of the Work Spell at
Three Levels of Illumination Intensity

                                  Period
Variance Ratios        Initial    Middle    Terminal
2 F.C./5 F.C.            1.29      4.12       1.02
2 F.C./50 F.C.           9.02*     7.15*      2.56
5 F.C./50 F.C.           6.96*     1.74       2.52

For 5 and 5 degrees of freedom the F value at the 5% level of significance is 5.05,
at the 1% level 10.97.
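
The asterisks in Table 3 follow from comparing each ratio with the two tabled F values. A minimal Python sketch of that comparison is given below; the names `flag` and `table3` are ours, and the critical values are simply those quoted in the note to the table.

```python
# Critical values of F for 5 and 5 degrees of freedom, as quoted in the table note.
F_05, F_01 = 5.05, 10.97

# Variance ratios from Table 3, in the order (initial, middle, terminal).
table3 = {
    "2 F.C./5 F.C.": (1.29, 4.12, 1.02),
    "2 F.C./50 F.C.": (9.02, 7.15, 2.56),
    "5 F.C./50 F.C.": (6.96, 1.74, 2.52),
}


def flag(ratio):
    """Return '**' at the 1% level, '*' at the 5% level, '' otherwise."""
    if ratio >= F_01:
        return "**"
    if ratio >= F_05:
        return "*"
    return ""


for label, ratios in table3.items():
    print(label, "  ".join(f"{r:.2f}{flag(r)}" for r in ratios))
```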


There is also a tendency for the variances at each level of illumination 
to increase with the duration of the work (Table 4). The only exception 
occurs at 2 F.C. where the variance increases at the middle but decreases 
somewhat toward the end of the work-spell, although the terminal value 
is still nearly three times larger than the initial value. 


Table 4

Ratios of the Variances of Scores Obtained in the Three Periods (Initial, Middle,
and Terminal) of the 2-Hour Work Spell at Various Levels
of Illumination Intensity

                                  Foot Candles
Variance Ratios               2        5        50
Middle/Initial period        5.86     1.84     7.39*
Terminal/Initial period      2.89     3.67    10.15*

For 5 and 5 degrees of freedom, the value of the F-ratio at the 5% level of significance
is 5.05, at the 1% level 10.97.

Thus both types of “stress,” the less favorable illumination and the
fatigue resulting from intensive uninterrupted work, result in an in-
creased inter-individual variability.

Discussion 

In many types of work producing visual strain, discrimination of fine
details is the critical visual function. This does not mean that other
visual functions, e.g., eye muscle coordination, are not also involved. 
However, these factors play an auxiliary rather than a fundamental role; 
this should also be the case in any experimental test designed as a lab- 
oratory version of an industrial work situation. In their studies on visual 
fatigue McFarland, Holway and Hurvich (5) used a work test involving 
primarily the function of accommodation; for 30 minutes the subjects 
continued to shift the fixation point from far to near. There is no ques- 
tion that such an eye exercise will produce severe fatigue of the intrinsic 
eye muscles, but this type of muscular work has little, if any, practical 
application. A work test suitable for laboratory investigations of vision 
should be in principle a visual acuity test, with the target size and the 
distance of test objects from the eye arranged in such a way that the 
recognition of details represents a hard task. This may be arranged by 
having the size of details a little above the threshold size. Under these 
conditions the effects of a 2-hour work spell at the lower levels of illu- 
mination may be considered comparable to visual fatigue resulting from 
8 hours of industrial inspection work.¹

In Bogoslavski’s work (1) sorting rice was used as a visual work test. 
The criteria for sorting rice are obviously complex. Although this type 
of work does decrease the electrical excitability of the retina, a phenom- 
enon used as an index of “visual fatigue,” the work cannot be adequately 
standardized and the performance score, which also depends on manual 
skill, would be of questionable significance. 

Luckiesh and Moss (4) used reading as a visual work test. Investi- 
gations on such a common type of human activity are of great interest. 
However, several objections may be raised when reading is used as an 
approach to study visual fatigue. In ordinary reading the letters are far 
above threshold so that no considerable visual strain is involved, at least 
for relatively short periods of reading. There are other, more serious 
disadvantages. The reading performance is not a “purely” visual test,
since it involves a significant mental component. It is difficult to obtain
a meaningful index of performance other than the raw speed of reading,
particularly when the work must extend over a period of hours and a
large series of experiments. The speed of reading depends on the content,
on the intellective level of the reader, and on individual initiative.

¹ It was difficult to persuade our subjects to work for 4 hours; it should be noted
that these men went without grumbling through such stresses as experimental malaria
and hard work in heat.

This does not exclude the possibility that the rate of reading may be affected signifi-
cantly by the illumination intensity (7). However, the rate of reading
is a complex criterion and the interpretation of changes under conditions 
of fatigue is difficult. 

The recognition of Landolt rings, such as used in a study by Weston 
(8), appears to be a more suitable technique than reading. Small circles, 
broken at different points of the circumference, were printed on test sheets 
and the task involved discrimination and cancellation of some of the rings 
according to the orientation of the opening in the ring. This type of work 
has a minimum of intellective content. 

The advantages of the method described in this paper are that the 
individual initiative affecting the speed of performance is eliminated, the 
auxiliary visual functions play a rather small role, and the uniformity of 
the experimental conditions can be easily maintained and duplicated. 
The speed with which the letters pass through the slit can be varied. It 
is one of the critical variables in our arrangement, as is the case in many 
industrial inspection operations in which conveyor transport is used. 
This affords, as Weston’s arrangement does not, an opportunity to study 
the effect of varying experimentally the speed of visual work. Since 
in our work test the number of letters presented in a given time depends 
only on the speed of the belt and is constant for all subjects, ‘output’ 
can be characterized in terms of correctly recognized letters or in terms of 
errors and omissions.

The experimental data on the effects of different levels of illumination 
are intended only as an illustration. The test is suitable for investiga- 
tions of a large array of variables affecting visual performance, including 
the various visual factors, environmental conditions, and duration of 
work. It may be valuable in the study of patients with such ophthal- 
mological disorders as abnormal refraction or retinal changes. Com- 
plaints of visual strain are very common, but they never have been sys- 
tematically studied in a quantitative visual performance test situation. 


Summary 


1. A test has been developed for experimental investigations of visual
performance and fatigue. It reproduces the essential features of an in-
dustrial conveyor inspection-operation and can serve as a “miniature job
situation.”


*A more complete study of the effect of illumination intensity between 2 and 300 
F.C. on visual performance and fatigue will be published elsewhere. 

2. The visual task consists in the recognition of letters which are pre-
sented in random order on a belt moving behind a narrow slit. Difficulty 
of the task is adjustable by using different letter sizes, different speeds, 
or different contrast. 


3. The test may be used as a standard method of producing visual
“strain” or as a means of studying the effect of different variables, in-
cluding illumination, on performance and on visual fatigue.

4. Manual and intellective factors have minimal influence. The 
effect of training is very small. 


5. Separate applications of the test to the same individual under 
standard conditions yield reasonably constant results. 

6. The test is sensitive to changes in the fundamental variables affect- 
ing visual performance, such as intensity of illumination. 


Received November 18, 1946. 


Students’ Honesty in Correcting Grading Errors


William C. F. Krueger 
Wayne University, Detroit, Michigan 


This series of experiments developed in an endeavor to note the hon- 
esty of students when they corrected grading errors. The first study was 
conducted in the following manner. 

During the semester, 13 weekly true-false tests, each of which covered 
the respective week’s assignment, were given to 129 students of Intro- 
ductory Psychology. The tests were checked and graded by the Right- 
Minus-Wrong formula. Since each test contained twenty-five test items, 
each wrong answer deducted eight points from the perfect score of one
hundred, and eight points seemed to be a lot of points to the students. 
The graded papers were returned to the students at the next meeting of 
the class. At that time every test item was read, discussed, and checked 
for correctness. While the students had their test papers they were urged 
to check the papers carefully. They were told to make a note on the 
test papers referring to any “errors” made by the person who originally 
graded the answer sheets. Special emphasis was placed upon the iden- 
tification of ALL errors in BOTH directions, namely grading errors, which 
when corrected, would raise the final grade, as well as errors, which when 
corrected, would lower the final score. All answer sheets were collected 
again for final adjustment of the grade as reported by the students. Dur-
ing the term 200 grading “errors” were purposely made at random. Only
one “error” was made on any paper at any one time. Sometimes a cor-
rect answer was marked as “wrong,” while some answers were scored as
“right” when they actually were wrong. Whenever a student inquired
about his grade on any paper, he was always given the grade as “cor-
rected” by him. Since no final adjustments were made until the semester
was completed, the only grades available were those as corrected by the 
students. 
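
The eight-point penalty follows from the Right-Minus-Wrong rule applied to a twenty-five item test. The short Python sketch below illustrates the arithmetic under the assumption that each item carries four points (100 divided by 25); the helper name `right_minus_wrong` is ours and is not taken from the study.

```python
def right_minus_wrong(right, wrong, items=25, perfect=100):
    """Score a true-false test by the Right-Minus-Wrong formula.

    Assumption: every item carries the same weight, perfect / items points.
    Omitted items count as neither right nor wrong.
    """
    if right < 0 or wrong < 0 or right + wrong > items:
        raise ValueError("right and wrong must be non-negative and sum to at most items")
    points_per_item = perfect / items
    return (right - wrong) * points_per_item


# With every item answered, each wrong answer costs eight points:
for wrong in range(4):
    print(wrong, right_minus_wrong(25 - wrong, wrong))
```

A wrong answer thus costs four points for the item not earned and four more as the penalty, which is why a single grading “error” shifted a paper's score by eight points.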

Of the 111 items scored incorrectly but in favor of the students, 103 
answers were left unchanged by the students, and only 8 scores were 
lowered to the correct grade. Of the 89 errors made to the disadvantage 
of the students, 80 were returned with the errors corrected and the grades 
raised correspondingly. For reasons unknown to the writer the remain- 
ing nine papers were left as marked to the disadvantage of the students. 
Of these nine papers seven grades were hopelessly low. Perhaps the
students thought that a change would not raise the grade significantly, 
and therefore they may have failed to raise the respective scores. 
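
The counts reported for this first study can be restated as percentages, which the text does not give. The following Python sketch does only that arithmetic; the dictionary labels and the helper `share` are our own.

```python
def share(count, total):
    """Percentage that count represents of total."""
    return 100 * count / total


# Counts reported for the first study (200 planted grading "errors" in all).
in_favor = {"left unchanged": 103, "lowered to correct grade": 8}      # 111 errors
against = {"raised to correct grade": 80, "left as marked": 9}         # 89 errors

for title, counts in (("Errors in the student's favor", in_favor),
                      ("Errors to the student's disadvantage", against)):
    total = sum(counts.values())
    print(f"{title} (n = {total})")
    for label, count in counts.items():
        print(f"  {label}: {count} ({share(count, total):.1f}%)")
```

On these counts roughly nine in ten favorable errors were left standing, while roughly nine in ten unfavorable errors were corrected.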

The following semester another plan was used. This time six of each 
student’s thirteen test papers were scored incorrectly. Three times the 
grade returned was higher than deserved, and three times the score was 
too low. The direction of the “errors” was counterbalanced. A group 
of 80 students in Introductory Psychology served as subjects. Another 
change was introduced. Two sets of books were kept. One set con- 
tained the grades as “corrected” by the students, while the other set 
recorded the properly corrected final scores. Whenever a student 
checked on his grades, the former book was used with the seemingly 
casual yet purposed comment: “The score book shows that you made a 
grade of. ...” The grade then mentioned was always the grade as 
“corrected” by the student himself. One fact became rather obvious.
The students inquired in far greater numbers than ever before, and with 
seemingly greater interest and satisfaction. Perhaps the students were 
encouraged in the belief that they were “getting by” and were putting 
something over on the instructor. The subjects were never informed that
they were participating in an experiment while the experiment was in
progress. However, later a very few students were told of the study.

The following results were obtained when the original grades were seemingly
to the student’s advantage:


1st time ... 68 (85%) left grade too high, while 12 (15%) lowered grade
2nd time ... 71 (89%) left grade too high, while 9 (11%) lowered grade
3rd time ... 76 (95%) left grade too high, while 4 ( 5%) lowered grade


When the returned paper showed a score to the student’s disadvan- 
tage, another trend was noted. 


1st time ... 6 (7%) left grade too low, while 74 (93%) raised grade
2nd time ... 4 (5%) left grade too low, while 76 (95%) raised grade
3rd time ... 1 (1%) left grade too low, while 79 (99%) raised grade


Obviously, as the experiment continued, the subjects became more
“alert” to the errors made to their disadvantage while the opposite trend
developed when the errors were to their advantage. Perhaps the system
of double book-keeping may have contributed to the observed trends.

The third experiment was conducted several semesters later. An-
other group of 80 subjects, students in Introductory Psychology, was em-
ployed. Some changes from the preceding arrangements were intro-
duced. At first only two papers were scored incorrectly. Once the score
was too high and the other time the grade was too low. By the time the
fourth of the thirteen tests had been given to the subjects two papers had
been treated as stated above. Again the subjects “corrected” the “grad-
ing errors.” The same comments with respect to accuracy in checking
were made. Again the subjects were urged to note ALL errors in BOTH 
directions. This third group corrected the alleged grading errors with 
the same tendencies as were observed earlier in the preceding experiments. 
The same style of “double bookkeeping” was continued. Then a change 
was introduced. 

Before the fifth set of papers was returned for study and correction by 
the subjects, they were told of their own inaccuracies and of the trends 
noted with the two earlier groups. With more or less subtle hints the 
students were informed repeatedly that their honesty was really at stake, 
and that nobody had fooled the experimenter. Beginning with the fifth 
test again two additional papers were scored too high and two other papers 
were scored too low. Thus six papers of the thirteen test papers con- 
tained “grading errors.” 

After the students had been informed concerning the purpose of the 
experiment a definite change in the manner of correcting grading errors 
was noted. When the papers were scored higher than they should have 
been graded, the following results were obtained. The 1st time refers to
the grading before the “talk” was given concerning student honesty, while
the 2nd and 3rd times followed the making of the comments.


1st time ... 72 (90%) left grade too high, while 8 (10%) lowered grade
2nd time ... 1 ( 1%) left grade too high, while 79 (99%) lowered grade
3rd time ... 1 ( 1%) left grade too high, while 79 (99%) lowered grade


When the tests were graded too low, the results were these. 


1st time ... 4 (5%) left grade too low, while 76 ( 95%) raised grade
2nd time ... 0 (0%) left grade too low, while 80 (100%) raised grade
3rd time ... 1 (1%) left grade too low, while 79 ( 99%) raised grade


Evidently the information given after the fourth paper was returned led
the subjects to correct the alleged grading errors more accurately. It
may well be that, when the subject was permitted “to get by,” the result
was an increasing tendency toward greater inaccuracy. The knowledge
that the subjects were “being experimented upon” and that the honesty
of the subject was at stake may account for the definite change toward
the high degree of accuracy in the correction of “grading errors” even
though the final grade would be lower. Conversations with students
after the last experiment was completed justified the above inferences.


Received December 6, 1946. 














Critique of Van Allyn’s System of Vocational Counseling 


Frances Oralind Triggs 
New York City, N. Y. 


The Job Qualifications Inventory and the Job Placement Reference (5)
show a somewhat different approach to the “analysis of an individual’s
actual and potential job qualifications, and for comparison of these find-
ings with a correlated study of occupational requirements” than has been
used by other authors. The purpose of the Job Qualifications Inventory
(JQI) is very broad according to the following statement:

The Job Qualifications Inventory has numerous uses, being at one time a 
gauge of ability for use in personnel management, and an indication of voca- 
tional aptitude to aid in counseling inexperienced youth or the occupationally 
maladjusted adult. To personnel directors, it will supply weds graphic 
data relating to the individual’s ambitions, training, experience, and accom-
plishment. School counselors will find it equally effective in evaluating
immature experience and the less tangible tendencies and intentions which
may be vocationally significant (5, p. 1). 

The author also states that ‘. . . the Job Qualifications Inventory is 
concerned with vocational interest, but performs the much broader func- 
tion of verifying the expressed interests and revealing actual job qualifi- 
cations—potential as well as achieved” (5, p. 17). This objective is 
attained, according to the author, by including in the Inventory “six 
basic questions in each of its thirty-five groups (which) are designed to 
prove or deny any interest which may be expressed in addition to indicating 
the presence or absence of training, experience, and accomplishment” 
(5, p. 2).

The Inventory consists of thirty-five sets of questions each set in- 
cluding six questions covering an occupational area. The six questions 
always follow the same pattern: (1), preference for a type of activity; 
(2), proof of preference by listing actual performance of a related 
activity; (3), training or education in the activity mentioned in one 
and two; (4), ambition as a life work and its relation to this activity; 
(5), paid experience in this activity; and (6), evidence of accomplishment 
or recognition suggestive of “unusual proficiency.” 

Each of the thirty-five sets of questions has a key letter: “A” or “P” or 
“p,.” 

After the JQI has been answered it is scored by giving one point to
each affirmative answer in the Inventory and then determining the Oc-
cupational Key “by means of which the individual’s qualifications or
aptitudes may be compared with job specifications in the Job Placement
Reference” (JPR) (5, p. 4). JPR is an encyclopedia of occupations whose
requirements are expressed in terms of the same basic elements which the
JQI reveals, in individuals, as qualifications (5, p. 5). The scores are re-
corded on the Profile on the first page of the JQI. The profile has the
following headings: Tools, Machines, Dexterity, Strength, Drafting, Ac- 
counting, Mathematics, Records, Grammar, Art, Foods, Biology, Serv- 
ices, Animals, Agriculture, Speech, Entertainment, Writing, Research, 
Selling, Administration, Humanities, Investigation, Music, Languages, 
Religion, Minerals, Electricity, Mechanics, Astronomy, Construction, 
Chemistry, Physics, Athletics, Vehicles. The letter keys of the set or sets
of questions receiving the largest number of affirmative answers are the
first letter or letters of the “keys.” These are followed by a hyphen and
then the sets receiving the second largest number of affirmative answers 
are placed in alphabetical order after the key or keys of the highest scoring 
sets. These keys are then interpreted by reference to the Reference 
Section of the JPR. This Reference lists occupations according to fam- 
ilies as listed in the Dictionary of Occupational Titles. 
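
The construction of a key from the scored Inventory can be sketched as follows. This Python fragment is our own hedged reading of the description above, not Van Allyn's procedure as published: the function name `occupational_key` and the sample counts are hypothetical, the ordering of tied top-scoring letters is assumed to be alphabetical, and sets with no affirmative answers are assumed not to enter the key.

```python
def occupational_key(affirmatives):
    """Build a key such as 'A-CEFGX' from {key letter: affirmative answers}.

    Assumptions (not stated in the source): tied top letters are written in
    alphabetical order, and sets with zero affirmative answers are ignored.
    """
    scored = {letter: count for letter, count in affirmatives.items() if count > 0}
    if not scored:
        raise ValueError("no affirmative answers recorded")
    levels = sorted(set(scored.values()), reverse=True)

    def letters_at(level):
        return "".join(sorted(l for l, c in scored.items() if c == level))

    top = letters_at(levels[0])
    if len(levels) == 1:
        return top                      # no second-highest set to append
    return f"{top}-{letters_at(levels[1])}"


# Hypothetical answer counts (each set has six questions).
sample = {"A": 6, "C": 5, "E": 5, "F": 5, "G": 5, "X": 5, "B": 2, "H": 1, "K": 0}
print(occupational_key(sample))
```

The resulting string would then be looked up in the Reference Section of the JPR.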

Before examining closely the philosophy of the author of this instru- 
ment, and the adequacy of this instrument according to the purposes as 
stated by the author, there should be a word of explanation concerning 
the JPR itself. 

This reference is divided into three main sections: the Key Index,
the Reference Section, and the Occupational Index. Thus, if the user
of the scored JQI wants to determine what a key means, say A-CEFGX,
he could look up that combination of letters in the Key Index in which 
the first letter of the possible keys are listed in alphabetical order, and 
find that this key refers to “Elements” as follows: Tools—Dexterity, 
Drafting, Accounting, Mathematics, Electricity. He could refer to the 
Reference Section and find this Key again in alphabetical order and under 
the same heading he would find that the key referred to Electrical Oc- 
cupations (Skilled), the USES Code Number, and Occupational Title and 
Industry headings from the USES family. The families are not followed 
exactly. There is often some slight deviation. In this section also each 
key has a group number. The next section is the alphabetical listing of 
the Occupational Title and Industry with the group number from the 
previous section. Thus, it is possible to find from this Occupational 
Index the Key if one knows only the Occupational Title or the Occupa- 
tional Title from the Key given in the Reference Section. 

The first part of the JPR is divided into a description of the JQI, 
including a sample inventory all filled out, a Manual for Counselors, and 
a Manual for Personnel Directors. It is from these sections that a reader can
get the philosophy of the author of this instrument. Direct quotations 
will be taken from this section to illustrate this philosophy. 


Relationship of Ability to Interests 


Probably one of the most revealing statements of the author of this 
instrument, if we are examining his philosophy, is as follows: “The re- 
cords of wartime employers and government industrial training programs, 
for example, seem to show definitely that practically anyone, not lacking 
in the requisite intelligence and physical requirements, can learn any trade 
or profession if the desire or necessity is sufficiently strong” (5, p. 19).
This is in line with his statement that “A truly impelling interest almost 
always assures ultimate proficiency” (5, p. 18). In such a case only in- 
terest tests would be needed! To support this statement he gives the 
illustration of the accountant who, in twenty minutes, was taught to
perform a skilled operation that was estimated by plant foremen to re-
quire at least two weeks’ intensive preparation by an experienced me-
chanic. He does not say that the aptitudes and interests of this accoun-
tant were like those of successful accountants in general and that he lacked
aptitudes and interests in mechanical fields. He, in a sense, invalidates
his own Inventory by this claim, in fact, because this person probably
would have received a high score on the “F” section of the inventory,
the Key letter of the Accounting Section. He might not have expressed
anything other than the fact that he enjoyed mechanical things because
it is possible he never learned to handle any of them and the subgroups
H and I, which are the sub-keys of one form of accounting, do not
ask these questions. This may be an extreme, but not improbable, il-
lustration.

The author follows this illustration by the illustration of thousands 
of lawyers, accountants, and other professional men who helped in the 
war effort by working in war plants. The fact that these men are in profes-
sional work does not rule out the possibility that they have other abilities, ap-
titudes, and interests. Nor does the fact that they are in a profession
prove that they are using their first and foremost interests,
aptitudes and abilities.

It may be said that the author covers himself sufficiently by the fact 
that his statement includes “requisite intelligence.” If he forecasts the 
further discovery of specialized types of intelligence such as the “Q”
and “L” scores on the ACE Psychological Examination, then his state-
ment may have more merit than a first reading would seem to give it,
though this supposition carries one into the future of research on testing.

Related to this latter matter is the author’s statement that “verified
interests are likely to be in fields compatible with the individual’s IQ
and personality” (5, p. 21). It is known that there is a tendency for 
the relationship between measured interests and scholastic aptitude as 
measured by the Q and L scores of the ACE to be in the logical direction. 
However, the correlations are low, though a few of them are statisti- 
cally significant. A great deal of research would have to be done on this 
subject before such a broad statement as is made by the author of the 
inventory could be verified, assuming that that is what he means by the 
statement. 


Relationship of Personality to Interests 


In addition to the statement made above concerning the relationship 
of ability to interests, the author of the Inventory also states that “Per- 
sonality also may be expected to harmonize with verified interests” (5, 
p. 21). He stated earlier that “Introverts annoyed at their own tendency 
toward self-effacement, turned with determination to extrovert occupa- 
tions and became successful lecturers and salesmen” (5, p. 20). Accord- 
ing to his inventory, measured interests would thus change. Would per- 
sonality also be changed? 

It is doubtful whether we are justified in making any very broad gen- 
eralizations on the basis of facts we have now concerning this subject. 
Portions of certain interest tests are similar to our questionnaire type of 
personality tests. Not much research has been done on the relation- 
ship of interest tests to personality as measured by projective techniques. 
Actually, when results of research are available, we may find a large 
amount of overlap between measured interests and measured personality 
though different measuring techniques are used. If, as it looks now, tech- 
niques are overlapping, it may be expected that results will be related 
and the statement is redundant. 


Permanency of Interests 


“In its treatment of vocational interests, the Job Qualifications In- 
ventory differs from all the other measuring instruments . . . in that 
it is concerned only with sustained, verified interest, and not mere ex- 
pression of preference at the moment of testing” (5, p. 17). This state- 
ment seems to presuppose that interests as measured by the usual type 
of vocational interest inventory are essentially not of a permanent nature. 
This matter has been the subject of a good deal of research. This research 
would seem to indicate that interests as measured by instruments which 
are demonstrated to be reliable and valid to the extent that we have been 
able to demonstrate validity, are relatively permanent. The reason for 
this permanence is a moot question. Few psychologists are willing to say 
that an individual’s interests cannot sometimes be changed or developed if 
the situation is favorable (1, 4). However, it does seem to be a fact that 
the usual amount of change is not great (3). 

We may use Strong’s statement to indicate the extent to which he 
considers that research has indicated that interests as measured by the 
Strong Vocational Interest Blank are permanent. Quoting a research 
study he states, “Roughly there are 45 chances in a hundred that the 
same rating will be obtained on retest within five years, 75 chances that 
the same rating or one just above or below it will be received, and 90 chances 
that the second rating will be within two steps of the first rating” (3, p. 
366). He goes on to analyze the reasons for the changes: “Part of the 
change is caused by forcing responses into one of three categories . . ., 
part is caused by increasing maturity common to men in general, and 
part is caused by true shifts in interests because of individual experience. 
When these factors can be disentangled, it is confidently expected that 
true changes in interest-test scores because of lack of permanence will 
be found to be relatively slight” (3, p. 368). And later he says after ex- 
amining the permanence of an individual’s interest profile, “The presence 
of a few negative and low positive coefficients stresses the fact that sta- 
bility of measured interests is not universal” (3, p. 373). This state- 
ment is in general agreement with the one given earlier, and is a recog- 
nition that he is not a supporter of absolute permanence of measured 
interests. 
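
The chances Strong quotes are cumulative, and it may help to see them broken into separate bands. The short Python sketch below is only an arithmetic illustration of the three quoted figures; the function and variable names are ours, and nothing beyond the quoted numbers is taken from Strong's data.

```python
# Cumulative chances (per hundred) quoted from Strong (3, p. 366):
# retest rating within 0, 1, or 2 steps of the first rating.
cumulative = {0: 45, 1: 75, 2: 90}


def band_chances(cum):
    """Convert cumulative chances into chances for each exact distance."""
    bands = {}
    previous = 0
    for steps in sorted(cum):
        bands[steps] = cum[steps] - previous
        previous = cum[steps]
    # Whatever remains lies farther away than the largest listed distance.
    bands["more"] = 100 - previous
    return bands


bands = band_chances(cumulative)
for distance, chances in bands.items():
    print(distance, chances)
```

Read this way, the quoted figures imply 45 chances in a hundred of the same rating, 30 of a rating exactly one step away, 15 of a rating exactly two steps away, and 10 of a larger shift.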

The extent to which interests are hereditary, or are largely developed 
by environmental forces has never been demonstrated (3, p. 680), (4). 
The author of the JQI raises this question and answers it by giving his 
opinion but does not verify it from a research point of view: “Is interest 
supported by the necessary mental ability and physical characteristics, 
to be the sole criterion?” (5, p. 19) and in answer to his question of what 
we shall test for, he concludes that “aptitude is a combination of inter- 
related conditions—both internal and external—which influence an in- 
dividual’s behavior. And the weight of evidence points to the external 
or environmental factors as being most important” (5, p. 19). 


Reliability of Interest Test Scores 


Reliability of the JQI, in the usual statistical sense, has not seemed 
to worry the author. He does say that this Inventory is not a test; it is 
a record of fact. He suggests that in scoring the Inventory, the facts 
recorded by the person filling it out be verified. However, if the author’s 
statement is true that “the Inventory allows little opportunity for ex- 
cessive optimism or conservatism, misinterpretation, or self-deception” 
(5, p. 22) then reliability would be implicit in the instrument. But surely 
this would make the instrument less useful as a counseling instrument, one 
which measures vocational interests of the counselee, for it would add 
little to the data already available in the records of the student. 

The author comes back to “verified” interests when he says “. . . 
unsupported preferences are often unstable, and desire for fame and for- 
tune finds expression in all sorts of ‘ambitions’ which never pass the stage 
of wishful thinking” (5, p. 17). Here the question would have to be 
asked: Are the preferences measured by the usual type interest inventory 
unsupported? If so, from whence come our reliability coefficients of as 
high as .98, but usually clustering about .90? No reliability coefficients 
are given for the scores as given by the JQI! Our experience has led us 
to think that statistical reliability might not be too high, for the usual 
“key” is made up of from four to six letters which means that the score 
was based on from about 20 to 36 affirmative answers. Taking into ac- 
count the fact that some negative selection also went on, this number 
would be raised somewhat. Kuder’s point that all answers on the usual 
type of preference inventory are interrelated (2, pp. 484-487) is also per- 
tinent here. Perhaps we should not be too pessimistic. But it is our 
duty to ask for proof! 
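
The worry above, that a key resting on some 20 to 36 answers may not be very reliable, can be illustrated with the Spearman-Brown formula, a standard psychometric relation that neither the JQI manual nor this critique invokes. The Python sketch below is offered only as an illustration; the function name and the example figures (a 100-item scale with reliability .90 cut to 25 comparable items) are hypothetical and are not data about the JQI or any published inventory.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened or shortened.

    reliability   -- reliability of the original test (between 0 and 1)
    length_factor -- new length divided by original length
    """
    k = length_factor
    return (k * reliability) / (1 + (k - 1) * reliability)


# Hypothetical illustration: a 100-item scale with reliability .90,
# cut down to 25 comparable items (length factor 0.25).
shortened = spearman_brown(0.90, 25 / 100)
print(round(shortened, 2))  # 0.69
```

Under these assumed figures the shortened scale would fall to about .69, which suggests why reliability coefficients for short keys need to be reported rather than taken for granted.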


Validity 


The author comes back time and again to “verified” interests, “inten- 
sity or continuity of interests,” “a truly impelling interest” which “almost 
always assures ultimate proficiency” (5, p. 18), which would seem to in- 
dicate one of two things: this inventory is most appropriate for use in 
selecting employees where skill for the job and secondarily interest in the 
job is the factor being examined from a record of facts, or that it is most 
appropriate for use with older persons who have had a chance to “verify” 
or gain proficiency in some fields. The author discusses this matter. 
He places the use in business and industry as the first purpose of the in- 
ventory, and the second “to assist in guiding inexperienced youth toward 
educational and occupational objectives which are most likely to be per- 
manently suitable” (5, p. 4). He does not indicate that, because he men- 
tioned its use in industry first, this is the most important use of it. 
He indicates that the use of the inventory with youths will produce a 
lower level profile than the profile of more mature people. 

One test of validity of a vocational interest inventory has been the 
extent to which the items sample broadly types of experiences which 
bear on the crystallization of interests in a specified occupation or career. 
This inventory seems to put the main stress on whether there has been 
training, experience, or recognition of the person’s “unusual proficiency” 
rather than sampling the experiences which bear on the crystallization of 
interests in a specified occupation. In other words, it is doubtful whether 
this inventory measures “interests” in the sense that one needs to measure 
them to aid the individual in the counseling process whether for the level 
of further education, placement or change of employment. If it is used 
for this purpose it would seem to be open to what Fryer calls the infor- 
mation error, the items being too specific in nature to allow persons to 
have many of the experiences. In other words, the claim of the author 
that his Inventory covers verified interests in “thirty-five occupational 
areas which represent the entire range of human knowledge and endeavor” 
(5, p. 22) would seem to need further justification. 

The author of the Inventory seems to presume there will be questions 
raised concerning the validity of the Inventory. He explains that it was 
administered to “a number of men and women known to be successful 
and happy in their work” and a profile for members of each group pre- 
pared. He feels that the number of persons in each group is not very 
important and does not supply this information for any group. He im- 
plies that the validity of the profile is dependent upon “proven success 
and contentment” in the work, in other words, the importance lies in 
obtaining a homogeneous group of successful and contented persons. 
His N, as he mentioned it, seems to have varied from six persons to one 
hundred. He makes no effort to explain how he proves happiness and 
success. He implies that if there were variations in a profile, they were 
investigated and found to be due to lack of success or absence of con- 
tentment. He admits not all profiles were so verified. Often it was 
necessary to take the facts from interview and other types of records of 
industries employing persons in the various classifications. Data from 
various samples were compared for consistency, however. 

The author suggests that wherever a profile is needed which is not 
included in the JPR, the user can construct such a profile and gives in- 
structions for doing so. He also suggests to college counselors that they 
“will perform an inestimable service by administering the Job Qualifi- 
cations Inventory to selected homogeneous groups of successful seniors 
who are obviously happy in their choice of career. Inventory profiles 
within each group will be very similar in their high-ranking elements, and 
a composite profile of the group can be used as a criterion for use with 
entering freshmen, and perhaps made generally available to high school 
counselors” (5, p. 23). 


Summary and Conclusions 


Certain criteria of the adequacy of an interest test should be kept in 
mind by those interested in its use. If a test does not meet most of these 
standards, the test should be used only on an experimental basis until 
data for judging its adequacy to function in the situation where it is to be 
used can be determined. These criteria do not differ greatly from the 
criteria by which other tests are chosen, but in certain details they do 
differ. Therefore, it is probably well to review them in brief before draw- 
ing conclusions concerning the value of the JQI and the JPR. 


1. Scores on an interest inventory should be as reliable for use with 
individuals as are scores on other types of tests (approximately .85 and 
above), and the extent of consistency of response should be measured by 
follow-up studies. 

2. The real test of the validity of any test is based on whether it does 
what needs to be done in the situation in which it is to be used. The 
best test of validity for the usual use to which an interest test is put is 
“success” or degree of satisfaction in a vocation. The time element in- 
volved in addition to the difficulty of defining “success” makes this cri- 
terion a most difficult one against which to check the validity of a test. 
There are certain other criteria, however, which may throw light on the 
validity of an interest test. They are: (a) intercorrelations of part 
scores; (b) relationship of interests as measured by the test to sex and 
age; (c) relationship of scores on the test to objective measures of gen- 
eral or scholastic aptitude, achievement in various subject matter areas, 
other instruments purporting to measure interests, vocational choice 
either stated preference or actual entrance into a vocation, and “success” 
or “degree of satisfaction in chosen educational work”; and (d) broad 
sampling of the types of experiences bearing on the crystallization of 
interests. 

3. Standardization data which indicate clear differentiation between 
persons in the several occupational areas or specific occupations should 
be available. When a test is first put on the market, it may be expected 
that these data may be more meagre than ideally desired. However, 
data which are available should be clearly labeled as to source, number, 
etc. Users 
should know exactly what is available when they review the background 
of the test before choosing it for use. 

4. Other information concerning the test should also be checked by 
users: (a) format; (b) adequacy of instructions given in the manual on 
interpreting and scoring the test; (c) original cost of test as compared to 
other tests available which furnish like results; and (d) cost and difficulty 
of scoring the test. 


In summary, judging by these criteria, this critique would seem to 
lead to the following conclusions: 


1. Only by extensive research into the relationship of scores on this 
Inventory to vocational and educational “success” and other techniques 
or instruments which measure aptitude, intelligence, interests, and per- 
sonality, could many of the claims made by the author be verified. 

2. The author’s claims for this technique of measuring “interests” 
await further validation. As the record stands, the Inventory would 
seem to be more similar to an application blank or an oral trade test than 
to the usual type of interest test. 


Received October 24, 1946. 


Book Reviews 


Fox, J. B., and Scott, J. F. Absenteeism: Management’s problem. Cam- 
bridge: Harvard University Graduate School of Business Administra- 
tion. 1943, Vol. XXX. $1.00. 

Mayo, E., and Lombard, G. F. F. Teamwork and labor turnover in the 
aircraft industry of Southern California. Cambridge: Harvard Uni- 
versity Graduate School of Business Administration. 1943, Vol. XXX. 
$1.00. 


These publications suggest the approach to the problems of absentee- 
ism and labor turnover in which absenteeism’s connection with external 
conditions of housing, transportation and sickness is only an indirect one. 
Absenteeism is considered a direct symptom of the general state of health 
of the internal organization and its conditioning of the workers’ responses 
to these outside factors. Specifically the type of industrial administra- 
tion that takes into account human relations is lauded. The second study 
confirms and supplements the first in describing more extensively the for- 
mation of teams, a natural outgrowth of the administration fully aware of 
human motivation. These teams ideally take over from first-line super- 
vision much of the task of maintaining communication with the workers. 
Such a point of view tends to absolve the worker from a large share of the 
moral blame hitherto laid to him for absentee and turnover records. 

Absenteeism: Management’s Problem, in demonstrating the approach 
described above, relates a case study carried on in the casting shops 
of three different companies whose workers were subject to similar out- 
side factors. The casting shop of Company C boasted an absenteeism 
record superior to the casting shops of Companies A and B over the same 
period of emergency war production. Several phases of C’s internal 
organization loomed as significant—their induction and training program 
for new employees, their group incentive payment system, a large dele- 
gation of responsibility to the workers, their training of supervisors in 
human relations and their imposition of rather severe penalties for ab- 
sences on key days. Perhaps most valuable of all was the ability of 
C’s management to anticipate outside changes and influences threatening 
rises of absenteeism. 

Several implications for management are stressed as a result of the 
Mayo-Lombard study. These are relative to the possibilities of achiev- 
ing a balanced relationship between organization of operations and ap- 
plication of skill and science on the one hand and organization of team- 
work on the other. The desire of people for association with others is 
deep-seated and inevitable and should not be neglected or thwarted by 
management. Hence the question on teamwork is not whether teams, 
but what kind. Will they be hostile or wary of wholehearted cooperation 
or will they be cooperative and friendly? The possibility of effecting 
transfers of teams rather than of individuals is hinted at, and the sugges- 
tion made that the frequent practice of selecting as workers’ represen- 
tatives the high producers should give way to the selection of team 
leaders. Finally, the authors endorse group incentive payment plans, 
considered in the light of the actual relations of persons at work, to the 
end that the formation of teams will be promoted. 

Although these investigations are based upon wartime data, they are 
considered by the authors to have no less significance for the reduction of 
peacetime absenteeism and labor turnover. And while the effectiveness 
of the various foregoing methods in the formation of industrial teams 
could not be widely demonstrated by the authors, it was their hope that 
management would begin to realize the advantages to be derived there- 
from. 

It is felt by the reviewer that the subject of organized labor, although 
not dealt with, is a pertinent one in these studies. Another question 
mark is the noticeable trend in past years away from group incentive 
plans. Finally, one wonders at the possibility, unmentioned in the 
studies, of utilizing selection and placement procedures for obtaining 
not only regular attenders but workers high in the traits that make for 
good team members or leaders. 


C. H. Lawshe, Jr. 
Purdue University 


Woodward, Luther E., and Rennie, Thomas A. C. Jobs and the man. 
Springfield, Illinois: Charles C. Thomas, 1945. Pp. vii + 132. $2.00. 


By the time the review of this book, written in anticipation of the 
problems of re-employment, shall appear in print the veterans’ rush back 
to civilian jobs will be largely over. Problems of occupational adjust- 
ment of the veteran will not differ in any fundamental respect from those 
of the rest of the population. What value will the book retain? Pro- 
fessional workers who have heard of the work of the Rehabilitation Clinic 
of the New York Hospital, under the direction of Dr. Rennie, expected 
an important factual and methodological contribution to the clinical ap- 
proach to occupational readjustment by a team of specialists including 
physicians, psychiatrists, psychologists, social workers, and occupational 
analysts. They will be disappointed. 

The manual is said to be directed to those who will be employing, 
supervising and counseling individuals with emotional problems. This 
naturally involves a large number of industrial personnel but no attempt 
has been made to either survey the present level of their training in clin- 
ical personnel work or to develop an ideal set-up, characterized in terms of 
skills and responsibilities, as a model for the future. Although in general 
the material is directed toward the layman’s point of view and the com- 
mon denominator of understanding, the text is by no means uniform in 
complexity or relevance of the material. 

We learn that the veterans should be placed in the right kind of jobs 
and that “understanding” veterans who come back “nervous” helps a 
lot. Also, they should be “treated helpfully” on the job. Except for 
the chapter on the administrative aspects of establishing veterans’ em- 
ployment programs, exemplified by the re-employment procedures of 
General Motors and Owens-Illinois Glass Co., there remains only one 
chapter, dealing with the techniques of industrial interviewing. The 
reader finds, in addition to other platitudes, that “Some interviews are 
simple, others are decidedly complex” (p. 66). Fortunately enough, the 
prospective counselor is given seven points to look for and to emphasize, 
and twenty-two additional do’s and don’t’s in interviewing and coun- 
seling. ... 

Even the end is not a happy one. In fact the glossary closing the 
book is a very poor specimen in lexicography. It would be unfortunate 
for any personnel men to learn, on the basis of the combined authority 
of a Ph.D. and an M.D., that feebleminded individuals form that part of 
the population who have a lower than average intellectual equipment. 


Josef Brozek 
Laboratory of Physiological Hygiene, 
University of Minnesota 


Cunningham, Bess V. Psychology for nurses. New York: D. Appleton- 
Century Company, 1946. Pp. xx + 336. $3.00. 


The primary purpose of this textbook, as stated by the author, is to 
make psychology an immediately useful science for students in hospital 
schools of nursing. In the opening chapter psychology is defined, the 
need for scientific methods and attitudes is discussed, and the methods 
of science described. Attention is then immediately directed to the 
problems of the student herself, her reasons for entering nursing, the 
habits and attitudes which she brings with her, and the importance of 
understanding her patients as individual human beings. As a basis for 
understanding why individuals differ, the nature of heredity and of envi- 
ronment is presented, and the interaction between the two discussed. 
Under the heading “Mainsprings of Action” the author describes basic 
physiological drives, motives which emerge as a child develops, and needs 
which seem common to the majority of human beings. Material on 
learning is presented under chapter headings of “Learning How to Study,” 
“Learning to Think and Reason,” “Social and Indirect Learnings,” and 
“Emotional Learnings.” In the discussion of reactions to strain and 
frustration, the nature and causes of stress and strain are presented, and 
the various adjustment mechanisms briefly described. The chapters 
on personality emphasize the uniqueness of personality makeup, and the 
methods which are commonly used to measure aspects of personality. 
In the final chapter the author directs the attention of the student away 
from her more or less immediate adjustments to her future function as a 
professional nurse and a citizen who will participate in far reaching social 
programs. 

Though the author has succeeded admirably in carrying out her pur- 
pose of making psychology an immediately useful science for students 
in hospital schools, minor criticisms of the book may be made. The 
attempt to distinguish between motives and common human needs seems 
to lead only to a repetition of some of the same material under different 
headings. In the chapters on learning, no attempt is made to define 
learning as such, or to discuss theories as to how learning actually takes 
place. Instead of directing attention, in the final chapter, to the need for 
participation in far reaching social programs, more attention might have 
been given to the problem of establishing democratic government within 
the school of nursing, and the importance of each individual’s contri- 
bution to the development of a wholesome environment for the entire 
group. 

The many outstanding qualities of the book, however, far outweigh 
minor criticisms. It seems far superior to any book which has been 
written in this area. 

The summaries at the close of each chapter and the suggested act- 
ivities which are designed to stimulate thinking and action are of great 
value. The material on the development of habits, standards, and at- 
titudes, and their importance in determining the adjustments which an 
individual will make in nursing is of particular significance at this time. 
The importance of childhood experiences in making immediate adjust- 
ment problems more understandable is stressed throughout the book, and 
the uniqueness of personality is constantly emphasized. With supple- 
mentary readings, the book might well be used by students in collegiate 
as well as hospital schools, or by graduate nurses with an inadequate 
background in psychology. 

Helen Nahm 
Duke University 





Weston, H. C. The relation between illumination and visual efficiency— 
the effect of brightness contrast. Industrial Health Research Board 
Report No. 87. London: His Majesty’s Stationery Office, 1945. 
Pp. 35. 9d. 

This report deals with two investigations on the effects of brightness 
contrast in relation to visual efficiency and was intended to test out the 
suggestion of Beuttell that if the relationship could be ascertained be- 
tween size, contrast and brightness for satisfactory visibility, then the 
illumination suitable for the performance of any task ought to be capable 
of computation. In the first study the task was to cancel Landolt broken 
rings having a given gap orientation under illumination ranging from 
0.8 to 500 foot-candles and contrasts between print and paper ranging 
from 0.36 to 0.91. Three sizes of gap were employed: 1, 3 and 6 minutes. 
The second investigation was similar to the first except that the illumina- 
tion ranged from 0.5 to 512 foot-candles, brightness contrast from 0.25 
to 0.97 and gaps in the Landolt ring were 1.5, 3.0 and 4.5 minutes. 

The main results include the following: (1) With constant illumination, 
performance did not vary for large sized gaps and for contrasts of 0.68 
and above. (2) For small sized gaps, the illumination for maximum per- 
formance varied inversely and the maximum performance varied directly 
with the contrast. (3) Performance depended upon contrast presented 
by the task even when different contrast tasks are given different illumi- 
nations so that each task presents the same absolute brightness difference. 

These investigations were well designed and the analyses yield im- 
portant data concerning the effects of brightness contrast, size of detail 
to be discriminated, and illumination intensities upon visual efficiency. 
The author failed to emphasize that, except for very fine details and very 
low brightness contrast, there was little gain in efficiency of response for 
illumination intensities above 10 to 15 foot-candles. This suggests that 
computing the illumination necessary for adequate visual discrimination 
in reading commonly used print would turn out to be approximately 10 
to 15 foot-candles since the visual task in such reading is similar in terms 
of size of detail and brightness contrast. Beuttell’s suggested technique 
is promising enough to warrant further investigation and evaluation. 
Unfortunately, the author did not follow through and indicate to what de- 
gree Beuttell’s suggestion was validated, nor discuss difficulties involved 
in such a procedure for prescribing illumination. 


Miles A. Tinker 
University of Minnesota 


Az-zarnuji. Instruction of the student. Theodora M. Abel and G. E. Von 
Grunebaum. New York: King’s Crown Press, 1947. Pp. 78. $2.00. 

The psychology of rumor. Gordon W. Allport and Leo Postman. New 
York: Henry Holt and Co., 1947. Pp. 247. $2.60. 

A psychology of growth. Bert I. Beverly. New York: McGraw-Hill 
Book Co., Inc., 1947. Pp. 235. $2.50. 

Studies in genius. Walter G. Bowerman. New York: The Philosophical 
Library, 1947. Pp. 343. $4.75. 

The H-T-P. A projective and a measure of adult intelligence. John N. 
Buck. Colony, Va.: John N. Buck, Lynchburg State Colony, 1947. 
Pp. 94. Limited free distribution to qualified clinical psychologists. 

Psychology of childhood and adolescence. Luella Cole and John J. B. 
Morgan. New York: Rinehart and Company, Inc., 1947. Pp. 416. 
$3.50. 

Brain and body weight in man: their antecedents in growth and evolution. 
Earl W. Count. New York: The New York Academy of Sciences, 
1947. Pp. 129. $2.00. 

Utilizing human talent. Frederick B. Davis. Washington, D. C.: Amer- 
ican Council on Education, 1947. Pp. 85. $1.25. 

The psychology of everyday living. Ernest Dichter. New York: Barnes 
and Noble, Inc., 1947. Pp. 239. $2.50. 

A basic text for guidance workers. C. E. Erickson. New York: Prentice- 
Hall, Inc., 1947. Pp. 566. $4.25. 

Counseling in schools of nursing. H. Phoebe Gordon, Katherine Dens- 
ford, and E. G. Williamson. New York: McGraw-Hill Book Co., 
Inc., 1947. Pp. 279. $3.00. 

Guide to occupational choice and training. Walter J. Greenleaf. Wash- 
ington, D. C.: Superintendent of Documents, U. S. Government 
Printing Office, 1947. Pp. 150. $.35. 

Practical psychiatry and mental hygiene. Samuel W. Hartwell. New 
York: McGraw-Hill Book Co., Inc., 1947. Pp. 325. $3.00. 

It’s up to you. Seward Hiltner. New York: Association Press, 1947. 
Pp. 32. $.10. 

Diagnosis of antisemitism. Two essays. Gustav Ichheiser. New York: 
Beacon House, 1946. Pp. 27. $1.50. 

Problems of physiological psychology. J. R. Kantor. Bloomington: Prin- 
cipia Press, Inc., 1947. Pp. 415. $5.00. 

Defining prestige in a rural community. Harold F. Kaufman. New 
York: Beacon House, 1946. Pp. 26. $1.50. 

Hypnotism today. Leslie M. LeCron and Jean Bordeaux. New York: 
Grune and Stratton, Inc., 1947. Pp. 278. $4.00. 

Children of the people. Dorothea Leighton and Clyde Kluckhohn. Cam- 
bridge: Harvard University Press, 1947. Pp. 277. $4.50. 

Handbook of correctional psychology. Robert M. Lindner and Robert V. 
Seliger. New York: The Philosophical Library, Inc., 1947. Pp. 691. 
$10. 

Twenty years of merit rating. Walter R. Mahler. New York: The Psy- 
chological Corporation, 1947. Pp. 73. 

Indians before Columbus. Paul S. Martin, George I. Quimby, and Donald 
Collier. Chicago: The University of Chicago Press, 1947. Pp. 582. 
$3.00. 

The analysis and control of human experiences. Volumes I and II. Paul 
Maslow. Brooklyn: Paul Maslow, 1946-47. Pp. 195 and Pp. 229. 
$3.50 each. 

The work, training and status of supervisors as reported by supervisors in 
industry. Bruce V. Moore, J. Ewing Kennedy, and George F. Cas- 
tore. State College, Pa.: The Pennsylvania State College, 1946. Pp. 
31. $1.00. 

Psychological testing. Theoretical and practical. James L. Mursell. 
New York: Longmans, Green and Co., Inc., 1947. Pp. 480. $4.00. 

How to prepare a foreman’s policy manual. R. C. Oberdahn. Deep 
River, Conn.: National Foremen’s Institute, 1946. Pp. 145. $7.50. 

Descriptive and sampling statistics. John G. Peatman. New York: Har- 
per and Brothers, 1947. $5.00. 

Nurse-patient relationships in psychiatry. Helena W. Render. New 
York: McGraw-Hill Book Co., Inc., 1947. Pp. 346. $3.00. 

Characteristics of adolescence. Dorothy M. Schnell. Minneapolis: Bur- 
gess Publishing Co., 1946. Pp. 68. $1.00. 

All of us have troubles. Harold Seashore. New York: Association Press, 
1947. Pp. 50. $.25 single copy. $2.25 dozen copies. 

The psychology of ego-involvements. Muzafer Sherif and Hadley Cantril. 
New York: John Wiley and Sons, Inc., 1947. Pp. 525. $6.00. 

Elementary educational psychology. Charles E. Skinner, Editor. New 
York: Prentice-Hall, Inc., 1946. Pp. 440. $3.50. 

Multiple-factor analysis. L. L. Thurstone. Chicago: The University 
of Chicago Press, 1947. Pp. 535. $7.50. 

How to use cumulative records. Arthur E. Traxler. Chicago: Science 
Research Associates, 1947. Pp. 40. $.50. 

Developing insight in initial interviews. Alice L. Voiland, Martha Lou 
Gundelach, and Mildred Corner. New York: Family Service Associa- 
tion of America, 1947. Pp. 54. $.60. 

Theory of games and economic behavior. John Von Neumann and Oskar 
Morgenstern. Princeton: Princeton University Press, 1947. Pp. 
641. $10. 

What is psychology? Werner Wolff. New York: Grune and Stratton, 
Inc., 1947. Pp. 410. $4.00. 

Psychology. Fifth edition. Robert S. Woodworth and Donald G. 
Marquis. New York: Henry Holt and Co., 1947. Pp. 677. $3.25. 

Counseling young adults. A symposium. New York: Association Press, 
1947. Pp. 40. $.75. 